Abstract

For a given collection $\mathcal{G}$ of directed graphs we define the join-reachability graph of $\mathcal{G}$, denoted
by $\mathcal{J}(\mathcal{G})$, as the directed graph that, for any pair of vertices $a$ and $b$, contains a path from $a$
to $b$ if and only if such a path exists in all graphs of $\mathcal{G}$. Our goal is to compute an efficient
representation of $\mathcal{J}(\mathcal{G})$. In particular, we consider two versions of this problem. In the explicit
version we wish to construct the smallest join-reachability graph for $\mathcal{G}$. In the implicit version
we wish to build an efficient data structure (in terms of space and query time) such that we
can quickly report the set of vertices that reach a query vertex in all graphs of $\mathcal{G}$. This problem
is related to the well-studied reachability problem and is motivated by emerging applications 
of graph-structured databases and graph algorithms. We consider the construction of join- 
reachability structures for two graphs and develop techniques that can be applied to both the 
explicit and the implicit problem. First we present optimal and near-optimal structures for 
paths and trees. Then, based on these results, we provide efficient structures for planar graphs 
and general directed graphs. 



1 Introduction 

In the reachability problem our goal is to preprocess a (directed or undirected) graph G into a 
data structure that can quickly answer queries that ask if a vertex b is reachable from a vertex 
a. This problem has numerous and diverse applications, including internet routing, geographical 
navigation, and knowledge-representation systems [20]. Recently, the interest in graph reachability 
problems has been rekindled by emerging applications of graph data structures in areas such as 
the semantic web, bioinformatics and social networks. These developments together with recent
applications in graph algorithms [8, 9] have motivated us to introduce the study of the join-
reachability problem that we define as follows: We are given a collection $\mathcal{G}$ of $\lambda$ directed graphs
$G_i = (V_i, A_i)$, $1 \le i \le \lambda$, where each graph $G_i$ represents a binary relation $R_i$ over a set of elements
$V \subseteq V_i$ in the following sense: For any $a, b \in V$, we have $a R_i b$ if and only if $b$ is reachable from
$a$ in $G_i$. Let $\mathcal{R} \equiv \mathcal{R}(\mathcal{G})$ be the binary relation over $V$ defined by: $a \mathcal{R} b$ if and only if $a R_i b$ for all
$i \in \{1, \ldots, \lambda\}$ (i.e., $b$ is reachable from $a$ in all graphs in $\mathcal{G}$). We can view $\mathcal{R}$ as a type of join
operation on graph-structured databases. Our objective is to find an efficient representation of this
relation. To the best of our knowledge, this problem has not been previously studied. We will
restrict our attention to the case of two input graphs ($\lambda = 2$).

Contribution. In this paper we explore two versions of the join-reachability problem. In the 
explicit version we wish to represent $\mathcal{R}$ with a directed graph $\mathcal{J} \equiv \mathcal{J}(\mathcal{G})$, which we call the join-
reachability graph of $\mathcal{G}$, i.e., for any $a, b \in V$, we have $a \mathcal{R} b$ if and only if $b$ is reachable from $a$
in $\mathcal{J}$. Our goal is to minimize the size (i.e., the number of vertices plus arcs) of $\mathcal{J}$. We consider
this problem in Sections 2 and 3 and present results on the computational and combinatorial
complexity of $\mathcal{J}$. In the implicit version we wish to represent $\mathcal{R}$ with an efficient data structure
(in terms of space and query time) that can quickly report all elements $a \in V$ satisfying $a \mathcal{R} b$ for any
query element $b \in V$. We deal with the implicit problem in Section 4. First we describe efficient
join-reachability structures for simple graph classes. Then, based on these results, we consider 
planar graphs and general directed graphs. Also, in Appendix [B] and Appendix [C] we consider 
join-reachability structures for planar st-graphs and lattices. Although we focus on the case of two 
directed graphs ($\lambda = 2$), we note that some of our results are easily extended to $\lambda \ge 3$ with the
use of appropriate multidimensional geometric structures. 

Applications. Instances of the join-reachability problem appear in various applications. For 
example, in the rank aggregation problem [5] we are given a collection of rankings of some elements 
and we may wish to report which (or how many) elements have the same ranking relative to a 
given element. This is a special version of join-reachability since the given collection of rankings 
can be represented by a collection of directed paths with the elements being the vertices of the 
paths. Similarly, in a graph-structured database with an associated ranking of its vertices we may 
wish to find the vertices that are related to a query vertex and have higher or lower ranking than 
this vertex. Instances of join-reachability also appear in graph algorithms arising from program 
optimization. Specifically, in [7] we need a data structure capable of reporting which vertices satisfy 
certain ancestor-descendant relations in a collection of rooted trees. Moreover, in [9] it is shown 
that any directed graph G with a distinguished source vertex s has two spanning trees rooted at 
s such that a vertex a is a dominator of a vertex b (meaning that all paths in G from s to b 
pass through a) if and only if a is an ancestor of b in both spanning trees. This generalizes the 
graph-theoretical concept of independent spanning trees. Two spanning trees of a graph G are 
independent if they are both rooted at the same vertex $r$ and for each vertex $v$ the paths from $r$ to
$v$ in the two trees are internally vertex-disjoint. Similarly, $\lambda$ spanning trees of $G$ are independent if
they are pairwise independent. In this setting, we can apply a join-reachability structure to decide
if $\lambda$ given spanning trees are independent. Finally we note that a variant of the join-reachability
problem we defined here appears in the context of a recent algorithm for computing two internally 
vertex-disjoint paths for any pair of query vertices in a 2-vertex connected directed graph [8]. 

Preliminaries and Related Work. The reachability problem is easy in the undirected case
since it suffices to compute the connected components of the input graph. Similarly, the undi- 
rected version of the join-reachability problem is also easy, as given the connected components of 
two undirected graphs $G_1$ and $G_2$ with $n$ vertices, we can compute the connected components of
$\mathcal{J}(\{G_1, G_2\})$ in $O(n)$ time. On the other hand, no reachability data structure is currently known
to simultaneously achieve $o(n^2)$ space and $o(n)$ query time for a general directed graph with $n$
vertices [20]. Nevertheless, efficient reachability structures do exist for several important cases.
First, asymptotically optimal structures exist for rooted trees [1] and planar directed graphs with
one source and one sink [12, 17]. For general planar graphs Thorup [18] gives an $O(n \log n)$-space
structure with constant query time. Talamo and Vocca [16] achieve constant query time for lattice
partial orders with an $O(n\sqrt{n})$-space structure.

Notation. In the description of our results we use the following notation and terminology. We 
denote the vertex set and the arc set of a directed graph (digraph) $G$ by $V(G)$ and $A(G)$, respectively.
Without loss of generality we assume that $V(G) = V$ for all $G \in \mathcal{G}$. The size of $G$, denoted
by $|G|$, is equal to the number of arcs plus vertices, i.e., $|G| = |V| + |A|$. We use the notation
$a \leadsto_G b$ to denote that $b$ is reachable from $a$ in $G$. (By definition $a \leadsto_G a$ for any $a \in V$.) The
predecessors of a vertex $b$ are the vertices that reach $b$, and the successors of a vertex $b$ are the
vertices that are reached from $b$. Let $P$ be a directed path (dipath); the rank of $a \in P$, $r_P(a)$,
is equal to the number of predecessors of $a$ in $P$ minus one, and the height of $a \in P$, $h_P(a)$, is
equal to the number of successors of $a$ in $P$ minus one. For a rooted tree $T$, we let $T(a)$ denote the
subtree rooted at $a$ and let $nca_T(a, b)$ denote the nearest common ancestor of $a$ and $b$. We will deal
with two special types of directed rooted trees: In an in-tree, each vertex has exactly one outgoing
arc except for the root which has none; in an out-tree, each vertex has exactly one incoming arc
except for the root which has none. We use the term unoriented tree for a directed tree with no
restriction on the orientation of its arcs. Similarly, we use the term unoriented dipath to refer to a
path in the undirected sense, where the arcs can have any orientation. In our constructions we map
the vertices of $V$ to objects in a $d$-dimensional space and use the notation $x_i(a)$ to refer to the $i$th
coordinate that vertex $a$ receives. Finally, for any two vectors $\xi = (\xi_1, \ldots, \xi_d)$ and $\zeta = (\zeta_1, \ldots, \zeta_d)$,
the notation $\xi \le \zeta$ means that $\xi_i \le \zeta_i$ for $i = 1, \ldots, d$.

1.1 Preprocessing: Computing Layers and Removing Cycles 

Thorup's Layer Decomposition. In [18] Thorup shows how to reduce the reachability problem 
for any digraph G to reachability in some digraphs with special properties, called 2-layered digraphs. 
A t-layered spanning tree T of G is a rooted directed tree such that any path in T from the root 
(ignoring arc directions) is the concatenation of at most t dipaths in G. A digraph G is t-layered 
if it has such a spanning tree. Now we provide an overview of Thorup's reduction. The vertices of 
$G$ are partitioned into layers $L_0, L_1, \ldots, L_{\mu-1}$ that define a sequence of digraphs $G^0, G^1, \ldots, G^{\mu-1}$
as follows. An arbitrary vertex $v_0 \in V(G)$ is chosen as a root. Then, layer $L_0$ contains $v_0$ and the
vertices that are reachable from $v_0$. For odd $i$, layer $L_i$ contains the vertices that reach the previous
layers $L_j$, $j < i$. For even $i$, layer $L_i$ contains the vertices that are reachable from the previous
layers $L_j$, $j < i$. To form $G^i$ for $i > 0$ we contract the vertices in layers $L_j$ for $j \le i - 1$ to a single
root vertex $r_0$; for $i = 0$ we set $r_0 = v_0$. Then $G^i$ is induced by $L_i$, $L_{i+1}$ and $r_0$. It follows that
each $G^i$ is a 2-layered digraph. Let $\iota(v)$ denote the index of the layer containing $v$, that is, $\iota(v) = i$
if and only if $v \in L_i$. The key properties of the decomposition are: (i) all the predecessors of $v$ in
$G$ are contained in $G^{\iota(v)-1}$ and $G^{\iota(v)}$, and (ii) $\sum_i |G^i| = O(|G|)$.
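
The layer assignment described above can be sketched in a few lines. This is only an illustration of the alternating forward/backward search, not Thorup's implementation: the function name `layer_decomposition`, the arc-list input format, and the choice to restart each search from all previously assigned vertices (simple, but not the most efficient option) are assumptions of this sketch. It also assumes that each layer contains only vertices not already placed in an earlier layer.

```python
from collections import deque

def layer_decomposition(n, arcs, root=0):
    """Return layer[v] for every vertex v of a digraph on vertices 0..n-1.

    Even layers are grown by following arcs forward from the vertices
    assigned so far, odd layers by following arcs backward.
    Vertices never reached (in the undirected sense) keep layer None.
    """
    succ = [[] for _ in range(n)]
    pred = [[] for _ in range(n)]
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)

    layer = [None] * n
    layer[root] = 0
    assigned = [root]          # all vertices placed in some layer so far
    i = 0
    while True:
        adj = succ if i % 2 == 0 else pred
        queue = deque(assigned)
        added = []
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if layer[w] is None:
                    layer[w] = i
                    added.append(w)
                    queue.append(w)
        if not added and i > 0:
            break              # no new vertices: the decomposition is complete
        assigned.extend(added)
        i += 1
    return layer
```

For the zig-zag digraph with arcs 0→1, 2→1, 2→3, 4→3 and root 0, the sketch places vertices 0 and 1 in $L_0$ and then one new vertex in each of $L_1$, $L_2$ and $L_3$.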

Removing Cycles. In the standard reachability problem, a useful preprocessing step that can 
reduce the size of the input digraph is to contract its strongly connected components (strong 
components) and consider the resulting acyclic graph. When we apply the same idea to join- 
reachability we have to deal with the complication that the strong components in the two digraphs 
may differ. Still, we can construct two acyclic digraphs $\hat{G}_1$ and $\hat{G}_2$ such that, for any $a, b \in V$,
$a \leadsto_{\mathcal{J}(\{G_1, G_2\})} b$ if and only if $a \leadsto_{\mathcal{J}(\{\hat{G}_1, \hat{G}_2\})} b$, and $|\hat{G}_i| \le |G_i|$, $i = 1, 2$. This is accomplished
as follows. First, we compute the strong components of $G_1$ and $G_2$ and order them topologically.
Let $G'_i$, $i = 1, 2$, denote the digraph produced after contracting the strong components of $G_i$. (We
remove loops and duplicate arcs so that each $G'_i$ is a simple digraph.) Also, let $G_i^j$ denote the
$j$th strong component of $G_i$. We partition each component $G_i^j$ into subcomponents such that two
vertices are in the same subcomponent if and only if they are in the same strong component in
both $G_1$ and $G_2$. The subcomponents are the vertices of $\hat{G}_1$ and $\hat{G}_2$. Next we describe how to add
the appropriate arcs. The process is similar for the two digraphs so we consider only $\hat{G}_1$.

Let $G_1^{j,1}, G_1^{j,2}, \ldots, G_1^{j,l_j}$ be the subcomponents of $G_1^j$, which are ordered with respect to the
topological order of $G'_2$. That is, if $x \in G_1^{j,i}$ and $y \in G_1^{j,i'}$, where $i < i'$, then in the topological
order of $G'_2$ the component of $x$ precedes the component of $y$. We connect the subcomponents by
adding the arcs $(G_1^{j,i}, G_1^{j,i+1})$ for $1 \le i < l_j$. Moreover, for each arc $(G_1^i, G_1^j)$ in $A(G'_1)$ we add
the arc $(G_1^{i,l_i}, G_1^{j,1})$ to $A(\hat{G}_1)$, where $G_1^{i,l_i}$ is the last subcomponent of $G_1^i$. See Figure 1. It is
straightforward to verify that $a \leadsto_{\mathcal{J}} b$ if and only if $a$ and $b$ are in the same subcomponent or the
subcomponent of $a$ is a predecessor of the subcomponent of $b$ in both $\hat{G}_1$ and $\hat{G}_2$.
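
As an illustration of this refinement step, the following sketch builds the vertex set and arc set of the acyclic digraph for $G_1$. It assumes, hypothetically, that the strong components of both digraphs have already been computed and numbered in topological order (`comp1[v]` and `comp2[v]`), and that the arcs of the contracted digraph of $G_1$ are given as pairs of component numbers; the function name and the representation of a subcomponent as a pair of component numbers are choices of this sketch.

```python
def refine_components(comp1, comp2, contracted_arcs1):
    """Build the refined acyclic digraph for G1.

    comp1[v], comp2[v]: topologically numbered strong-component ids of v
    in G1 and G2.  contracted_arcs1: arcs (i, j) between components of G1.
    Returns (sub, arcs): sub[v] is the subcomponent of v, encoded as the
    pair (comp1[v], comp2[v]); arcs is the set of arcs between subcomponents.
    """
    sub = list(zip(comp1, comp2))

    # Group the subcomponents of each G1-component, ordered by the
    # topological number of their G2-component.
    by_comp = {}
    for c1, c2 in sorted(set(sub)):
        by_comp.setdefault(c1, []).append(c2)

    arcs = set()
    # Chain the subcomponents inside each strong component of G1.
    for c1, c2s in by_comp.items():
        for x, y in zip(c2s, c2s[1:]):
            arcs.add(((c1, x), (c1, y)))
    # For each contracted arc (i, j): last subcomponent of i -> first of j.
    for i, j in contracted_arcs1:
        arcs.add(((i, by_comp[i][-1]), (j, by_comp[j][0])))
    return sub, arcs
```

The digraph for $G_2$ would be obtained by calling the same function with the roles of the two component labelings exchanged.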

2 Computational Complexity of Computing the Smallest $\mathcal{J}(\{G_1, G_2\})$

Figure 1: The contracted digraphs $G'_1$ and $G'_2$ and their corresponding acyclic digraphs $\hat{G}_1$ and $\hat{G}_2$.

We explore the computational complexity of computing the smallest $\mathcal{J}(\{G_1, G_2\})$: Given two
digraphs $G_1 = (V, A_1)$ and $G_2 = (V, A_2)$ we wish to compute a digraph $\mathcal{J} \equiv \mathcal{J}(\{G_1, G_2\})$ of
minimum size such that for any $a, b \in V$, $a \leadsto_{\mathcal{J}} b$ if and only if $a \leadsto_{G_1} b$ and $a \leadsto_{G_2} b$. We consider
two versions of this problem, depending on whether $\mathcal{J}$ is allowed to have Steiner vertices (i.e.,
vertices not in $V$) or not: In the unrestricted version $V(\mathcal{J}) \supseteq V$, while in the restricted version
$V(\mathcal{J}) = V$. Computing $\mathcal{J}$ is NP-hard in the unrestricted case. This is implied by a straightforward
reduction from the reachability substitute problem, which was shown to be NP-hard by Katriel et
al. [13]. In this problem we are given a digraph $H$ and a subset $U \subseteq V(H)$, and ask for the smallest
digraph $H^*$ such that for any $a, b \in U$, $a \leadsto_{H^*} b$ if and only if $a \leadsto_H b$. For the reduction, we let
$G_1 = H$ and let $G_2$ contain all the arcs connecting vertices in $U$ only, that is, $A(G_2) = U \times U$.
Clearly, for any $a, b \in U$ we have $a \leadsto_{\mathcal{J}} b$ if and only if $a \leadsto_H b$. Therefore computing the smallest
join-reachability graph is equivalent to computing $H^*$. In the restricted case, on the other hand,
we can compute $\mathcal{J}$ using transitive closure and transitive reduction computations, which can be
done in polynomial time [2]. (This is done as follows: First we compute the transitive closure
matrices $M_1$ and $M_2$ of $G_1$ and $G_2$ respectively. Then we form the transitive closure matrix $M$ of
$\mathcal{J}$ by taking the logical AND of corresponding entries in $M_1$ and $M_2$. Finally we compute the
transitive reduction of the resulting transitive closure matrix $M$.) This implies the next theorem.

Theorem 2.1. Let $\mathcal{J}$ be the smallest join-reachability graph of a collection of digraphs. The
computation of J is feasible in polynomial time if Steiner vertices are not allowed, and NP-hard 
otherwise. 
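
The restricted-case procedure (closure, entrywise AND, reduction) can be sketched as follows. This is a minimal cubic-time illustration, not the algorithm of [2]; the function names are hypothetical, and the reduction step assumes the join relation is acyclic, for example after the cycle-removal preprocessing of Section 1.1.

```python
def transitive_closure(n, arcs):
    """Boolean reachability matrix of a digraph on vertices 0..n-1 (reflexive)."""
    reach = [[i == j for j in range(n)] for i in range(n)]
    for u, v in arcs:
        reach[u][v] = True
    for k in range(n):                 # Warshall's algorithm
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach

def restricted_join_graph(n, arcs1, arcs2):
    """Arcs of a smallest join-reachability graph without Steiner vertices.

    Assumes the join relation is acyclic, so that its transitive
    reduction is unique.
    """
    m1 = transitive_closure(n, arcs1)
    m2 = transitive_closure(n, arcs2)
    m = [[m1[i][j] and m2[i][j] for j in range(n)] for i in range(n)]
    arcs = []
    for i in range(n):
        for j in range(n):
            if i != j and m[i][j]:
                # Keep (i, j) only if no third vertex lies between i and j.
                if not any(k != i and k != j and m[i][k] and m[k][j]
                           for k in range(n)):
                    arcs.append((i, j))
    return arcs
```

For the dipaths 0→1→2→3 and 0→2→1→3, vertices 1 and 2 are incomparable in the join, and the sketch returns the four arcs of a diamond.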

The existence of Steiner vertices can reduce the size of $\mathcal{J}$ significantly. Consider for example a
complete bipartite digraph $G$ with $V(G) = X \cup Y$ and $A(G) = X \times Y$. This digraph has the same
transitive closure (restricted to the vertices of $G$) as the digraph $G'$ with $V(G') = V(G) \cup \{z\}$ and
$A(G') = \{(x, z), (z, y) \mid x \in X, y \in Y\}$. In Section 3 we explore the combinatorial complexity of the
unrestricted join-reachability graph and provide bounds for $|\mathcal{J}|$ in several cases.
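
To make the saving concrete, the sizes of the two digraphs in this example can be compared directly, using the size measure $|G| = |V| + |A|$ from the Notation paragraph. The balanced choice $|X| = |Y| = n/2$ is an illustrative assumption, not a case singled out in the text.

```latex
\begin{align*}
|G|  &= |X| + |Y| + |X|\,|Y|, \\
|G'| &= \bigl(|X| + |Y| + 1\bigr) + \bigl(|X| + |Y|\bigr), \\
\text{so for } |X| = |Y| = n/2: \quad
|G|  &= n + n^2/4, \qquad |G'| = 2n + 1 .
\end{align*}
```

A single Steiner vertex thus replaces a quadratic number of arcs by a linear number.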

3 Combinatorial Complexity of $\mathcal{J}(\{G_1, G_2\})$



In this section we provide bounds on the size of $\mathcal{J}(\{G_1, G_2\})$ in several cases. These results are
summarized in the next theorem. 

Theorem 3.1. Given two digraphs $G_1$ and $G_2$ with $n$ vertices, the following bounds on the size of
the join-reachability graph $\mathcal{J}(\{G_1, G_2\})$ hold:

(a) $\Theta(n \log n)$ in the worst case when $G_1$ is an unoriented tree and $G_2$ is an unoriented dipath.

(b) $O(n \log^2 n)$ when both $G_1$ and $G_2$ are unoriented trees.

(c) $O(n \log^2 n)$ when $G_1$ is a planar digraph and $G_2$ is an unoriented dipath.

(d) $O(n \log^3 n)$ when both $G_1$ and $G_2$ are planar digraphs.

(e) $O(\kappa_1 n \log n)$ when $G_1$ is a digraph that can be covered with $\kappa_1$ vertex-disjoint dipaths and $G_2$
is an unoriented dipath.

(f) $O(\kappa_1 n \log^2 n)$ when $G_1$ is a digraph that can be covered with $\kappa_1$ vertex-disjoint dipaths and
$G_2$ is a planar graph.

(g) $O(\kappa_1 \kappa_2 n \log n)$ when each $G_i$, $i = 1, 2$, is a digraph that can be covered with $\kappa_i$ vertex-disjoint
dipaths.

In the following sections we prove Theorem 3.1. In each case we provide a construction of the
corresponding join-reachability graph that achieves the claimed bound. In Section 4 we provide
improved space bounds for the implicit representation of $\mathcal{J}(\{G_1, G_2\})$, i.e., data structures that
answer join-reachability reporting queries fast. Still, a process that computes an explicit representation
of $\mathcal{J}(\{G_1, G_2\})$ can be useful, as it provides a natural way to handle collections of more than
two digraphs (i.e., it allows us to combine the digraphs one pair at a time). 

3.1 Two Paths 

We start with the simplest case where $G_1$ and $G_2$ are dipaths with $n$ vertices. First we show that
we can construct a join-reachability graph of size $O(n \log n)$. Given this result we can provide
bounds for trees, planar digraphs, and general digraphs. Then we show this bound is tight, i.e.,
there are instances for which $\Omega(n \log n)$ size is needed. We begin by mapping the vertices of $V$ to a
two-dimensional rank space: Each vertex $a$ receives coordinates $(x_1(a), x_2(a))$ where $x_1(a) = r_{G_1}(a)$
and $x_2(a) = r_{G_2}(a)$. Note that these ranks are integers in the range $[0, n-1]$. Now we can view
these vertices as lying on an $n \times n$ grid, such that each row and each column of the grid contains
exactly one vertex. Clearly, $a \mathcal{R} b$ if and only if $(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$.



Figure 2: The mapping of the vertices of two dipaths to 2d rank space and the construction of $\mathcal{J}_\ell$.
Steiner vertices in $\mathcal{J}_\ell$ are white.



Upper bound. We use a simple divide-and-conquer method. Let $\ell$ be the vertical line with $x_1$-coordinate
equal to $n/2$. A vertex $z$ is to the right of $\ell$ if $x_1(z) > n/2$ and to the left of $\ell$ otherwise.
The first step is to construct a subgraph $\mathcal{J}_\ell$ of $\mathcal{J}$ that connects the vertices to the left of $\ell$ to the
vertices to the right of $\ell$. For each vertex $b$ to the right of $\ell$ we create a Steiner vertex $b'$ and add
the arc $(b', b)$. Also, we assign to $b'$ the coordinates $(n/2, x_2(b))$. We connect these Steiner vertices
in a dipath starting from the vertex with the lowest $x_2$-coordinate. Next, for each vertex $a$ to the
left of $\ell$ we locate the Steiner vertex $b'$ with the smallest $x_2$-coordinate such that $x_2(a) \le x_2(b')$. If
$b'$ exists we add the arc $(a, b')$. See Figure 2. Finally we recurse for the vertices to the left of $\ell$ and
for the vertices to the right of $\ell$. It is easy to see that $\mathcal{J}$ contains a path from $a$ to $b$ if and only if
$(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$. To bound $|\mathcal{J}|$ note that we have $O(\log n)$ levels of recursion, and at
each level the number of added Steiner vertices and arcs is $O(n)$. Hence, the $O(n \log n)$ bound for
two dipaths follows.
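
The divide-and-conquer construction can be sketched as follows. The names `build_join_graph` and `reachable_from` are hypothetical, the input is assumed to be the two rank arrays (each a permutation of $0, \ldots, n-1$), and each recursive call splits its vertices at the median $x_1$-rank of the current subproblem. Re-sorting the right half by $x_2$ at every level keeps the sketch short at the cost of an extra logarithmic factor in running time; the size of the output is unaffected.

```python
from bisect import bisect_left

def build_join_graph(x1, x2):
    """Return (num_vertices, arcs) of a join-reachability graph of two dipaths.

    x1[v], x2[v]: ranks of vertex v in the two dipaths.
    Vertices 0..n-1 are the original ones; Steiner vertices get ids >= n.
    """
    n = len(x1)
    arcs = []
    next_id = [n]

    def connect(vs):                      # vs is sorted by x1
        if len(vs) < 2:
            return
        mid = len(vs) // 2
        left, right = vs[:mid], vs[mid:]
        right_by_x2 = sorted(right, key=lambda v: x2[v])
        keys = [x2[b] for b in right_by_x2]
        steiner = []
        for b in right_by_x2:             # Steiner vertex b' with arc (b', b)
            s = next_id[0]
            next_id[0] += 1
            arcs.append((s, b))
            if steiner:                   # chain the Steiner vertices by x2
                arcs.append((steiner[-1], s))
            steiner.append(s)
        for a in left:                    # lowest Steiner vertex at or above a
            k = bisect_left(keys, x2[a])
            if k < len(keys):
                arcs.append((a, steiner[k]))
        connect(left)
        connect(right)

    connect(sorted(range(n), key=lambda v: x1[v]))
    return next_id[0], arcs

def reachable_from(num_vertices, arcs, src):
    """Set of vertices reachable from src (including src itself)."""
    succ = [[] for _ in range(num_vertices)]
    for u, v in arcs:
        succ[u].append(v)
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for w in succ[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen
```

Calling `reachable_from` on the result and keeping only ids below $n$ gives exactly the vertices that dominate the source in both coordinates.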

The case of two unoriented dipaths $G_1$ and $G_2$ can be reduced to that of dipaths, yielding
the same $O(n \log n)$ bound. This is accomplished by splitting $G_1$ and $G_2$ into maximal subpaths
that consist of arcs with the same orientation. Then $\mathcal{J}$ is formed from the union of separate join-
reachability graphs for each pair of subpaths of $G_1$ and $G_2$. The $O(n \log n)$ bound follows from the
fact that each vertex appears in at most two subpaths of each unoriented dipath, so in at most four
subgraphs. We remark that our construction can be generalized to handle more dipaths, with an
$O(\log n)$ factor blowup per additional dipath.

Figure 3: The digraph used in the lower bound proof of Section 3.1 for $n = 16$. The arcs are
directed towards northeast. The $x_2$-coordinate of each vertex is produced by reversing the bits of
its $x_1$-coordinate.

Lower bound. Let $G_1$ be any dipath, and let $x_1(a) = r_{G_1}(a)$. Also let $x_1^i(a)$ denote the $i$th bit in
the binary representation of $x_1(a)$ and let $\beta = \lceil \log_2 n \rceil$ be the number of bits in this representation.
We use similar notation for $x_2(a)$. We define $G_2$ such that the rank of $a$ in $G_2$ is $x_2(a) = x_1(a)^R$,
where $x_1(a)^R$ is the integer formed by the bit-reversal in the binary representation of $x_1(a)$, i.e.,
$x_2^i(a) = x_1^{\beta-1-i}(a)$ for $0 \le i \le \beta - 1$. Let $\mathcal{P}$ be the set that contains all pairs of vertices $(a, b)$ that
satisfy $x_1^i(a) = 0$, $x_1^i(b) = 1$ and $x_1^j(a) = x_1^j(b)$, $j \ne i$, for $0 \le i \le \beta - 1$. Notice that for a pair
$(a, b) \in \mathcal{P}$, $x_1(a) < x_1(b)$ and $x_1(a)^R < x_1(b)^R$. Hence $(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$, which implies
$a \leadsto_{\mathcal{J}} b$. Now let $G$ be the digraph that is formed by the arcs $(a, b) \in \mathcal{P}$. See Figure 3. Then
$a \leadsto_G b$ only if $a \leadsto_{\mathcal{J}} b$. Moreover, the transitive reduction of $G$ is itself and has size $\Omega(n \log n)$.
We also observe that any two vertices in $G$ share at most one immediate successor. Therefore the
size of $G$ cannot be reduced by introducing Steiner vertices. This implies that the size of $\mathcal{J}$ is also
$\Omega(n \log n)$.
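
The lower-bound instance is easy to generate, which may help when checking the counting argument. The sketch below identifies each vertex with its $x_1$-rank and assumes $n = 2^\beta$; the function names are illustrative.

```python
def bit_reverse(x, bits):
    """Reverse the lowest `bits` bits of the integer x."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r

def lower_bound_arcs(bits):
    """Arcs (a, b), given by x1-ranks, where b is a with one 0 bit set to 1."""
    n = 1 << bits
    arcs = []
    for a in range(n):
        for i in range(bits):
            if not (a >> i) & 1:
                arcs.append((a, a | (1 << i)))
    return arcs
```

Each of the $n$ vertices has $\beta$ bit positions and every arc flips exactly one 0 bit, so the instance has $n\beta/2$ arcs, and every arc $(a, b)$ satisfies `bit_reverse(a, bits) < bit_reverse(b, bits)` as the proof requires.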

3.2 Tree and Path 

Let $G_1$ be a rooted (in- or out-)tree and $G_2$ a dipath. First we note that the ancestor-descendant
relations in a rooted tree can be described by two linear orders (corresponding to a preorder and
a postorder traversal of the tree) and therefore we can get an $O(n \log^2 n)$ bound on the size of
$\mathcal{J}$ using the result of Section 3.1. Here we provide an $O(n \log n)$ bound, which also holds when $G_1$
is unoriented. This upper bound together with the $\Omega(n \log n)$ lower bound of Section 3.1 implies
Theorem 3.1(a).

Let $T$ be the rooted tree that results from $G_1$ after removing arc directions. We associate each
vertex $x \in T$ with a label $h(x) = h_{G_2}(x)$, the height of $x$ in $G_2$. If $G_1$ is an out-tree then any
vertex $b$ must be reachable from all its ancestors $a$ in $T$ with $h(a) \ge h(b)$. Similarly, if $G_1$ is an
in-tree then any vertex $b$ must be reachable from all its descendants $a$ in $T$ with $h(a) \ge h(b)$. We
begin by assigning a depth-first search interval to each vertex in $T$. Let $I(a) = [s(a), t(a)]$ be the
interval of a vertex $a \in T$; $s(a)$ is the time of the first visit to $a$ (during the depth-first search)
and $t(a)$ is the time of the last visit to $a$. These times are computed by incrementing a counter
after visiting or leaving a vertex during the search. This way all the $s()$ and $t()$ values that are
assigned are distinct and for any vertex $a$ we have $1 \le s(a) < t(a) \le 2n$. Moreover, by well-known
properties of depth-first search, we have that $a$ is an ancestor of $b$ in $T$ if and only if $I(b) \subseteq I(a)$; if
$a$ and $b$ are unrelated in $T$ then $I(a)$ and $I(b)$ do not intersect. Now we map each vertex $a$ to the
$x_1$-axis-parallel segment $S(a) = I(a) \times h(a)$.
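
A short sketch of the interval assignment follows. It assumes the tree is given as a dictionary mapping each vertex to its list of children (a representation chosen for this example), and it advances the counter once on every entry and every exit so that the $2n$ time stamps are exactly $1, \ldots, 2n$.

```python
def dfs_intervals(children, root):
    """Return dictionaries s, t with the first and last visit time of each vertex."""
    s, t = {}, {}
    counter = 0
    stack = [(root, False)]
    while stack:
        v, leaving = stack.pop()
        counter += 1
        if leaving:
            t[v] = counter
        else:
            s[v] = counter
            stack.append((v, True))        # revisit v after its subtree
            for c in reversed(children.get(v, [])):
                stack.append((c, False))
    return s, t

def is_ancestor(s, t, a, b):
    """True if I(b) is contained in I(a), i.e. a is an ancestor of b (or a == b)."""
    return s[a] <= s[b] and t[b] <= t[a]
```

With these intervals the ancestor test reduces to two integer comparisons, which is what allows the vertices to be treated as horizontal segments in rank space.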

As in Section 3.1 we use a divide-and-conquer method to build $\mathcal{J}$. We will consider $G_1$ to be
an out-tree; the in-tree case is handled similarly and yields the same asymptotic bound. Let $\ell$ be
the horizontal line with $x_2$-coordinate equal to $n/2$. A vertex $x$ is above $\ell$ if $h(x) > n/2$; otherwise
($h(x) \le n/2$), $x$ is below $\ell$. We create a subgraph $\mathcal{J}_\ell$ of $\mathcal{J}$ that connects the vertices above $\ell$ to
the vertices below $\ell$. To that end, for each vertex $u$ above $\ell$ we create a Steiner vertex $u'$ together
with the arc $(u, u')$. Let $z$ be the nearest ancestor of $u$ in $T$ that is above $\ell$. If $z$ exists then we add
the arc $(z', u')$. Then, for each vertex $y$ below $\ell$ we locate the nearest ancestor $u$ of $y$ in $T$ that is
above $\ell$. If $u$ exists then we add the arc $(u', y)$. See Figure 4. Finally, we recurse for the vertices
above $\ell$ and for the vertices below $\ell$.
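
The out-tree case of this construction can be sketched as below. The input format (a parent array for $T$ and an array of distinct heights in $G_2$) and the function names are assumptions of the sketch. The nearest ancestor above $\ell$ is found by walking up the parent pointers, which is simple but can be slow on deep trees; the size of the resulting graph does not depend on how that ancestor is located.

```python
def build_tree_path_join(parent, h):
    """Join-reachability graph of an out-tree and a dipath.

    parent[v]: parent of v in the tree (None for the root).
    h[v]: height of v in the dipath (all distinct).
    Returns (num_vertices, arcs); Steiner vertices get ids >= len(parent).
    """
    n = len(parent)
    arcs = []
    next_id = [n]

    def connect(vs):                        # vs is sorted by increasing h
        if len(vs) < 2:
            return
        mid = len(vs) // 2
        below, above = vs[:mid], vs[mid:]
        steiner = {}
        for u in above:                     # Steiner vertex u' with arc (u, u')
            steiner[u] = next_id[0]
            next_id[0] += 1
            arcs.append((u, steiner[u]))

        def nearest_above_ancestor(v):
            p = parent[v]
            while p is not None and p not in steiner:
                p = parent[p]
            return p

        for u in above:                     # arc (z', u')
            z = nearest_above_ancestor(u)
            if z is not None:
                arcs.append((steiner[z], steiner[u]))
        for y in below:                     # arc (u', y)
            u = nearest_above_ancestor(y)
            if u is not None:
                arcs.append((steiner[u], y))
        connect(below)
        connect(above)

    connect(sorted(range(n), key=lambda v: h[v]))
    return next_id[0], arcs

def reachable_from(num_vertices, arcs, src):
    """Set of vertices reachable from src (including src itself)."""
    succ = [[] for _ in range(num_vertices)]
    for u, v in arcs:
        succ[u].append(v)
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for w in succ[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen
```

In the resulting graph an original vertex $a$ reaches an original vertex $b$ exactly when $a$ is an ancestor of $b$ in $T$ and $h(a) \ge h(b)$.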

It is not hard to verify the correctness of the above construction. The size of the resulting graph
can be bounded by $O(n \log n)$ as in Section 3.1. Furthermore, we can generalize this construction for
an unoriented tree and an unoriented path, and achieve the same $O(n \log n)$ bound as required
by Theorem 3.1(a). (We omit the details, which are similar to the more complicated construction
of Section 3.4.)

3.3 Two Trees 

The construction of Section 3.2 can be extended to handle more than one dipath. We show how
to apply this extension in order to get an $O(n \log^2 n)$ bound for the join-reachability graph of two
rooted trees. We consider the case where $G_1$ is an out-tree and $G_2$ is an in-tree; the other two cases
(two out-trees and two in-trees) are handled similarly.

Let $T_1$ and $T_2$ be the corresponding undirected trees. We assign to each vertex $a$ two depth-
first search intervals $I_1(a) = [s_1(a), t_1(a)]$ and $I_2(a) = [s_2(a), t_2(a)]$, where $I_j(a)$ corresponds to $T_j$,
$j = 1, 2$. We create two linear orders (i.e., dipaths), $P_1$ and $P_2$, from the $I_2$-intervals as follows: In
$P_1$ the vertices are ordered by decreasing $s_2$-value and in $P_2$ by increasing $t_2$-value. Each vertex
$a$ is mapped to an $x_1$-axis-parallel segment $I_1(a) \times x_2(a) \times x_3(a)$ (in three dimensions), where
$x_2(a) = h_{P_1}(a)$ and $x_3(a) = h_{P_2}(a)$. Then $a \leadsto_{\mathcal{J}} b$ if and only if $I_1(b) \subseteq I_1(a)$ and $(x_2(b), x_3(b)) \le
(x_2(a), x_3(a))$. See Figure 5.

Again we employ a divide-and-conquer approach and use the method of Section 3.2 as a subroutine.
The details are as follows. Let $p$ be the plane with $x_3$-coordinate equal to $n/2$. We construct
a subgraph $\mathcal{J}_p$ of $\mathcal{J}$ that connects the vertices above $p$ (i.e., vertices $z$ with $x_3(z) > n/2$) to the
vertices below $p$ (i.e., vertices $z$ with $x_3(z) \le n/2$). Then we use recursion for the vertices above $p$
and the vertices below $p$.

Figure 4: The mapping of the vertices of a rooted tree and a dipath to horizontal segments in a 2d
rank space and the construction of $\mathcal{J}_\ell$.

Figure 5: The mapping of the vertices of two rooted trees to horizontal segments in a 3d rank
space. The values in brackets above the segments correspond to the $x_3$-coordinates.

Figure 6: The construction of $\mathcal{J}_{p,\ell}$.

We construct $\mathcal{J}_p$ using the method of Section 3.2 with some modifications. Let $\ell$ be the horizontal
line with $x_2$-coordinate equal to $n/2$. We create a subgraph $\mathcal{J}_{p,\ell}$ of $\mathcal{J}_p$ that connects
the vertices above $p$ and $\ell$ to the vertices below $p$ and $\ell$. To that end, for each vertex $z$ with
$(x_2(z), x_3(z)) > (n/2, n/2)$ we create a Steiner vertex $z'$ together with the arc $(z, z')$. Let $u$ be the
nearest ancestor of $z$ in $T_1$ such that $(x_2(u), x_3(u)) > (n/2, n/2)$. If $u$ exists then we add the arc
$(u', z')$. Finally, for each vertex $y$ with $(x_2(y), x_3(y)) \le (n/2, n/2)$ we locate the nearest ancestor $z$
of $y$ in $T_1$ such that $(x_2(z), x_3(z)) > (n/2, n/2)$. If $z$ exists then we add the arc $(z', y)$. See Figure 6.
Then we recurse for the vertices above $\ell$ and for the vertices below $\ell$.

Now we bound the size of our construction. From Section 3.2 we have that the size of each
substructure $\mathcal{J}_p$ is $O(n \log n)$. Since each vertex participates in $O(\log n)$ such substructures, the
total size is bounded by $O(n \log^2 n)$.



3.4 Unoriented Trees 

We can reduce the case of unoriented trees to that of rooted trees by applying Thorup's layer decomposition
(see Section 1.1). We apply this decomposition to both $G_1$ and $G_2$. Let $G_i^0, G_i^1, \ldots, G_i^{\mu_i - 1}$
be the sequence of rooted trees produced from $G_i$, $i = 1, 2$, where each $G_i^j$ is a 2-layered tree. See
Figure 7. For even $j$, $G_i^j$ consists of a core out-tree, formed by the arcs directed away from the
root, and a collection of fringe in-trees. The situation is reversed for odd $j$, where the core tree is
an in-tree and the fringe trees are out-trees. We call a vertex of the core tree a core vertex; we call
a vertex of a fringe tree (excluding its root) a fringe vertex.

Figure 7: An unoriented tree and its sequence of 2-layered trees. Fringe trees are encircled.

We build $\mathcal{J}$ as the union of join-reachability graphs $\mathcal{J}_{i,j}$ for each pair $(G_1^i, G_2^j)$. Each graph $\mathcal{J}_{i,j}$
is constructed similarly to Section 3.3 with the exception that we have to take special care for the
fringe vertices. (We also remark that in general $\mathcal{J}_{i,j} \ne \mathcal{J}(\{G_1^i, G_2^j\})$.) A vertex $z \in V(G_1^i) \cap V(G_2^j)$
is included in $\mathcal{J}_{i,j}$ if one of the following cases holds: (i) $z$ is a core vertex in at least one of $G_1^i$ and
$G_2^j$, or (ii) $z$ is a fringe vertex in both $G_1^i$ and $G_2^j$ and the corresponding fringe trees containing $z$
are either both in-trees or both out-trees. Let $V_{i,j}$ be the vertices in $V(G_1^i) \cap V(G_2^j)$ that satisfy
the above condition.

If $V_{i,j} = \emptyset$ then $\mathcal{J}_{i,j}$ is empty. Now suppose $V_{i,j} \ne \emptyset$. First consider the case where the core
of $G_1^i$ is an out-tree. We contract each fringe in-tree to its root and let the new core supervertex
correspond to the vertices of the contracted fringe tree. Let $\hat{G}_1^i$ be the out-tree produced from this
process. Symmetrically, if the core of $G_1^i$ is an in-tree then the contraction of the fringe out-trees
produces an in-tree $\hat{G}_1^i$. We repeat the same process for $G_2^j$. Next, we assign a depth-first search
interval $I_1(z)$ to each vertex $z$ in $\hat{G}_1^i$ and a depth-first search interval $I_2(z)$ to each vertex $z$ in $\hat{G}_2^j$,
as in Section 3.3. The vertices in $V_{i,j}$ are assigned a depth-first search interval in both trees, and
therefore can be mapped to horizontal segments in a 3d space, as in Section 3.3. Hence, we can
employ the method of Section 3.3 with some necessary changes that involve the fringe vertices. Let
$z \in V_{i,j}$ be a fringe vertex in at least one of $G_1^i$ and $G_2^j$. If the fringe tree containing $z$ is an in-tree
then we only include in $\mathcal{J}_{i,j}$ arcs leaving $z$; otherwise we only include arcs entering $z$.

Finally we need to show that the size of the resulting graph is $O(n \log^2 n)$. This follows from
the fact that each subgraph $\mathcal{J}_{i,j}$ has size $O(n \log^2 n)$ and that each vertex can appear in at most
four such subgraphs. Theorem 3.1(b) follows.

3.5 Planar Digraphs 

Now we turn to planar digraphs and combine our previous constructions with Thorup's reachability
oracle [18]. From this combination we derive the bounds stated in Theorem 3.1(c) and (d). First
we need to provide some details for the reachability oracle of [18].

Let $G$ be a planar digraph, and let $G^0, G^1, \ldots, G^{\mu-1}$ be the sequence of 2-layered digraphs
produced from $G$ as described in Section 1.1. Consider one of these digraphs $G^i$. The next step
is to obtain a separator decomposition of $G^i$. To that end, we treat $G^i$ as an undirected graph
and compute a separator $S$ whose removal separates $G^i$ into components, each with at most half
the vertices. The separator $S$ consists of three root paths of a spanning tree of $G^i$ rooted at $r_0$.
Because $G^i$ is 2-layered, each root path in $S$ corresponds to at most two dipaths in $G^i$. The key
idea now is to process each separator dipath $Q$ and find the connections between $V(G^i)$ and $Q$.
For each $v \in V(G^i)$ two quantities are computed: (i) $\mathrm{from}_v[Q]$, which is equal to $r_Q(u)$, where $u \in Q$
is the vertex with the highest rank in $Q$ such that $u \leadsto_{G^i} v$, and (ii) $\mathrm{to}_v[Q]$, which is equal to $r_Q(u)$,
where $u \in Q$ is the vertex with the lowest rank in $Q$ such that $v \leadsto_{G^i} u$. Clearly there is a path
from $a$ to $b$ that passes through $Q$ if and only if $\mathrm{to}_a[Q] \le \mathrm{from}_b[Q]$. The same process is carried out
recursively for each component of $G^i \setminus V(S)$. The depth of this recursion is $O(\log n)$, so each vertex
is connected to $O(\log n)$ separator dipaths. The space and construction time for this structure is
$O(n \log n)$.
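
The role of the two quantities can be illustrated with a naive computation for a single separator dipath. This is not how the oracle of [18] computes them (that would defeat the stated time bound); the sketch simply runs one search per vertex of $Q$, and its function names and adjacency-list input are assumptions made for the example.

```python
from collections import deque

def reach_set(adj, src):
    """Vertices reachable from src along adj (including src)."""
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def connections(succ, Q):
    """Compute frm[v] and to[v] for every vertex v with respect to the dipath Q.

    frm[v]: highest rank in Q of a vertex that reaches v (None if there is none).
    to[v]:  lowest rank in Q of a vertex reachable from v (None if there is none).
    """
    n = len(succ)
    pred = [[] for _ in range(n)]
    for u in range(n):
        for w in succ[u]:
            pred[w].append(u)
    frm = [None] * n
    to = [None] * n
    for rank, u in enumerate(Q):            # later (higher) ranks overwrite
        for v in reach_set(succ, u):
            frm[v] = rank
    for rank in range(len(Q) - 1, -1, -1):  # later (lower) ranks overwrite
        for v in reach_set(pred, Q[rank]):
            to[v] = rank
    return frm, to

def passes_through(frm, to, a, b):
    """True if some path from a to b uses a vertex of Q."""
    return to[a] is not None and frm[b] is not None and to[a] <= frm[b]
```

Once both arrays are stored, a reachability query through $Q$ costs a single comparison.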

Now we consider how to construct a join-reachability graph when $G_1$ is a planar digraph. We
begin with the case where $G_2$ is a dipath. First we perform the layer decomposition of $G_1$ and
construct the corresponding graph sequence $G_1^0, G_1^1, \ldots, G_1^{\mu-1}$. Then we form pairs of digraphs
$\mathcal{P}_i = \{G_1^i, G_2^i\}$ where $G_2^i$ is a dipath containing only the vertices in $V(G_1^i)$ in the order they appear
in $G_2$. Clearly $a \leadsto_{\mathcal{J}} b$ if and only if $a \leadsto_{\mathcal{J}_{\iota(b)-1}} b$ or $a \leadsto_{\mathcal{J}_{\iota(b)}} b$, where $\mathcal{J}_i$ is the join-reachability
graph of $\mathcal{P}_i$. Then $\mathcal{J}$ is formed from the union of $\mathcal{J}_0, \ldots, \mathcal{J}_{\mu-1}$.

To construct $\mathcal{J}_i$ we perform the separator decomposition of $G_1^i$, so that each vertex is associated
with $O(\log n)$ separator dipaths. Let $Q$ be such a separator dipath. Also, let $V_Q$ be the set of vertices
that have a successor or a predecessor in $Q$. We build a subgraph $\mathcal{J}_{i,Q}$ of $\mathcal{J}_i$ for the vertices in
$V_Q$; $\mathcal{J}_i$ is formed from the union of the subgraphs $\mathcal{J}_{i,Q}$ for all the separator dipaths of $G_1^i$. The
construction of $\mathcal{J}_{i,Q}$ is carried out as follows. Let $z \in V_Q$. If $z$ has a predecessor in $Q$ then we
create a vertex $z^-$ which is assigned coordinates $x_1(z^-) = \mathrm{from}_z[Q]$ and $x_2(z^-) = r_{G_2}(z)$, and add
the arc $(z^-, z)$. Similarly, if $z$ has a successor in $Q$ then we create a vertex $z^+$ which is assigned
coordinates $x_1(z^+) = \mathrm{to}_z[Q]$ and $x_2(z^+) = r_{G_2}(z)$, and add the arc $(z, z^+)$.

Now we can use the method of Section 3.1 to build the rest of $J_{i,Q}$, so that $a \leadsto_{J_{i,Q}} b$
if and only if $(x_1(a^+), x_2(a^+)) \le (x_1(b^-), x_2(b^-))$. Let $\ell$ be the vertical line with
$x_1$-coordinate equal to $n/2$. The first step is to construct the subgraph of $J_{i,Q}$ that connects
the vertices $a^+$ with $x_1(a^+) < n/2$ to the vertices $b^-$ with $x_1(b^-) > n/2$. For each such $b^-$
we create a Steiner vertex $b'$ and add the arc $(b', b^-)$. Also, we assign to $b'$ the coordinates
$(n/2, x_2(b^-))$. We connect these Steiner vertices in a dipath starting from the vertex with the
lowest $x_2$-coordinate. Next, for each vertex $a^+$ with $x_1(a^+) < n/2$ we locate the Steiner vertex
$b'$ with the smallest $x_2$-coordinate such that $x_2(a^+) \le x_2(b')$. If $b'$ exists we add the arc
$(a^+, b')$. Finally we recurse for the vertices with $x_1$-coordinate in $[1, n/2)$ and for the vertices
with $x_1$-coordinate in $(n/2, n]$.
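
The divide-and-conquer step can be sketched in Python as follows. This is a hedged illustration rather than the exact construction of the text: it splits on integer column ranges instead of the line $x_1 = n/2$, and it adds a base case for a single column so that points sharing an $x_1$-coordinate are handled. The function name `build_dominance_graph`, the vertex encoding `('+', a)`, `('-', b)`, `('steiner', id)` and the dictionary inputs are assumptions of this example.

```python
from bisect import bisect_left
from collections import defaultdict

def build_dominance_graph(plus, minus):
    """plus, minus: dicts name -> (x1, x2) with non-negative integer coordinates.
    Returns an adjacency dict in which ('+', a) reaches ('-', b) exactly when
    x1(a) <= x1(b) and x2(a) <= x2(b)."""
    adj = defaultdict(list)
    steiner_count = 0

    def connect(sources, targets):
        # Link each source to all targets with larger-or-equal x2 via a Steiner dipath.
        nonlocal steiner_count
        if not sources or not targets:
            return
        targets = sorted(targets, key=lambda b: minus[b][1])
        chain = []
        for b in targets:
            s = ("steiner", steiner_count)
            steiner_count += 1
            adj[s].append(("-", b))
            chain.append(s)
        for s, t in zip(chain, chain[1:]):
            adj[s].append(t)                   # dipath in increasing x2
        keys = [minus[b][1] for b in targets]
        for a in sources:
            j = bisect_left(keys, plus[a][1])  # first Steiner vertex with x2 >= x2(a)
            if j < len(chain):
                adj[("+", a)].append(chain[j])

    def recurse(lo, hi, ps, ms):
        if not ps or not ms:
            return
        if lo == hi:                           # single column: only x2 matters
            connect(ps, ms)
            return
        mid = (lo + hi) // 2
        left_p = [a for a in ps if plus[a][0] <= mid]
        right_p = [a for a in ps if plus[a][0] > mid]
        left_m = [b for b in ms if minus[b][0] <= mid]
        right_m = [b for b in ms if minus[b][0] > mid]
        connect(left_p, right_m)               # pairs separated by the splitting line
        recurse(lo, mid, left_p, left_m)
        recurse(mid + 1, hi, right_p, right_m)

    xs = [p[0] for p in plus.values()] + [p[0] for p in minus.values()]
    if xs:
        recurse(min(xs), max(xs), list(plus), list(minus))
    return adj
```

Each point takes part in one `connect` call per recursion level, which is where the logarithmic factor in the size of the resulting graph comes from.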

It remains to bound the size of $J$. From Section 3.1 we have $|J_{i,Q}| = O(|V_Q| \log |V_Q|)$.
Moreover, the bound $\sum_Q |V_Q| = O(|V(G_1^i)| \log |V(G_1^i)|)$, where the sum is taken over all
separator paths of $G_1^i$, implies $|J_i| \le \sum_Q |J_{i,Q}| = O(|V(G_1^i)| \log^2 |V(G_1^i)|)$.
Finally, since $\sum_i |V(G_1^i)| = O(n)$ we obtain $|J| \le \sum_i |J_i| = O(n \log^2 n)$.

We handle the case where $G_2$ is an unoriented dipath as noted in Section 3.1, which implies
Theorem 3.1(c). The methods we developed here, in combination with the structures of Section 3.4,
result in a join-reachability graph of size $O(n \log^3 n)$ for a planar digraph and an unoriented tree.
The same bound of $O(n \log^3 n)$ is achieved for two planar digraphs, as stated in Theorem 3.1(d).

3.6 General Graphs 

A technique that is used to speed up transitive closure and reachability computations is to cover 
a digraph with simple structures such as dipaths, chains, or trees (e.g., see [1]). Such techniques 
are well-suited to our framework as they can be combined with the structures we developed earlier. 
We also remark that the use of the preprocessing steps of Section 1.1 reduces the problem from
general digraphs to acyclic and 2-layered digraphs. In this section we describe how to obtain
join-reachability graphs with the use of dipath covers. This gives the bounds stated in Theorem
3.1(e)-(g); similar results can be derived with the use of tree covers. Again for simplicity, we first
consider the case where $G_1$ is a general digraph and $G_2$ is a dipath.

A dipath cover is a decomposition of a digraph into vertex-disjoint dipaths. Let
$P_1^1, P_1^2, \ldots, P_1^{\kappa_1}$ be a dipath cover of $G_1$. For each vertex $v$ and each path $P_1^i$
we compute $\mathrm{from}_v[P_1^i]$, i.e., $r_{P_1^i}(z)$ where $z \in P_1^i$ is the vertex with the highest
rank in $P_1^i$ such that $z \leadsto_{G_1} v$. Let $P_2^i$ be the dipath that consists of the vertices in
$P_1^i$ ordered by increasing rank in $G_2$. Also, set $\mathrm{from}_v[P_2^i] = r_{P_2^i}(z)$ where
$z \in P_1^i$ is the vertex with the largest rank such that $r_{G_2}(z) \le r_{G_2}(v)$. Let $V_{P_1^i}$ be the
set of vertices that have a predecessor in $P_1^i$. We build a subgraph $J_i$ of $J$ that connects the
vertices of $P_1^i$ to $V_{P_1^i}$. Then $J$ is formed from the union of the subgraphs $J_i$. For each
$z \in V_{P_1^i}$ we create a vertex $z^-$ which is assigned coordinates $x_1(z^-) = \mathrm{from}_z[P_1^i]$
and $x_2(z^-) = \mathrm{from}_z[P_2^i]$, and add the arc $(z^-, z)$. Also, for each $z \in P_1^i$ we create a
vertex $z^+$ which is assigned coordinates $x_1(z^+) = r_{P_1^i}(z)$ and $x_2(z^+) = r_{P_2^i}(z)$, and add
the arc $(z, z^+)$. Now we can build a join-reachability graph, so that $a \leadsto_{J_i} b$ if and only if
$(x_1(a^+), x_2(a^+)) \le (x_1(b^-), x_2(b^-))$, as in Section 3.1.
The size of this graph is bounded by $\sum_i |V_{P_1^i}| \log |V_{P_1^i}| = O(\kappa_1 n \log n)$, which
implies the result of Theorem 3.1(e). We can extend this method to handle two general digraphs and
obtain the bound of Theorem 3.1(g). The case where $G_2$ is a planar digraph is handled by combining
the above method with the techniques of Section 3.5, resulting in Theorem 3.1(f).
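
As an illustration of how the labels $\mathrm{from}_v[P_1^i]$ might be computed, the following Python sketch propagates them along a topological order of an acyclic digraph. It is a sketch under stated assumptions: vertices are numbered $0, \ldots, n-1$, every path of the cover is given as a list of vertices in path order whose consecutive arcs appear in `arcs`, a vertex is taken to reach itself, and $-1$ marks "no predecessor on this path". The name `from_table` is hypothetical.

```python
def from_table(n, arcs, paths):
    """table[v][i] = rank in paths[i] of the highest-ranked vertex that reaches v
    (a vertex reaches itself), or -1 if no vertex of paths[i] reaches v."""
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1
    table = [[-1] * len(paths) for _ in range(n)]
    for i, path in enumerate(paths):
        for r, v in enumerate(path):
            table[v][i] = r                 # a path vertex starts with its own rank
    ready = [v for v in range(n) if indeg[v] == 0]
    while ready:                            # Kahn's topological order
        u = ready.pop()
        for v in succ[u]:
            for i in range(len(paths)):
                # whatever reaches u also reaches its successor v
                table[v][i] = max(table[v][i], table[u][i])
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return table
```

For a digraph with $m$ arcs this takes $O(\kappa_1 (n + m))$ time, which is adequate for an example but is not claimed to match the preprocessing used in the paper.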

4 Data Structures for Join-Reachability 

Now we deal with the data structure version of the join-reachability problem. Our goal is to
construct an efficient data structure for $J = \mathcal{J}(\{G_1, G_2\})$ such that given a query vertex $b$
it can report all vertices $a$ satisfying $a \leadsto_J b$. We state the efficiency of a structure using the
notation $(s(n), q(n,k))$, which refers to a data structure with $O(s(n))$ space and $O(q(n,k))$ query
time for reporting $k$ elements. In order to design efficient join-reachability data structures we apply
the techniques we developed in Section 3. The bounds that we achieve this way are summarized in the
following theorem.

Theorem 4.1. Given two digraphs $G_1$ and $G_2$ with $n$ vertices we can construct join-reachability
data structures with the following efficiency:

(a) $(n, k)$ when $G_1$ is an unoriented tree and $G_2$ is an unoriented dipath.

(b) $(n, \log n + k)$ when $G_1$ is an out-tree and $G_2$ is an unoriented tree.

(c) $(n \log^{\varepsilon} n, \log\log n + k)$ (for any constant $\varepsilon > 0$), when $G_1$ and $G_2$ are unoriented trees.

(d) $(n \log n, k \log n)$ when $G_1$ is a planar digraph and $G_2$ is an unoriented tree.

(e) $(n \log^2 n, k \log^2 n)$ when both $G_1$ and $G_2$ are planar digraphs.

(f) $(n \kappa_1, k)$ when $G_1$ is a general digraph that can be covered with $\kappa_1$ vertex-disjoint
dipaths and $G_2$ is an unoriented tree.

(g) $(n(\kappa_1 + \log n), k \kappa_1 \log n)$ or $(n \kappa_1 \log n, k \log n)$ when $G_1$ is a general digraph
that can be covered with $\kappa_1$ vertex-disjoint dipaths and $G_2$ is a planar digraph.

(h) $(n(\kappa_1 + \kappa_2), \kappa_1 \kappa_2 + k)$ or $(n \kappa_1 \kappa_2, k)$ when each $G_i$, $i = 1, 2$, is a
digraph that can be covered with $\kappa_i$ vertex-disjoint dipaths.

Next we provide the constructions that prove the bounds stated in Theorem 4.1. Throughout
this section $k$ denotes the size of the output of a join-reachability reporting query.

4.1 Two Paths 

Let $G_1$ and $G_2$ be two dipaths. We use the mapping of Section 2. Recall that each vertex $a$ is
mapped to a point $(x_1(a), x_2(a))$ on an $n \times n$ grid so that $a \leadsto_J b$ if and only if
$(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$. This is a two-dimensional point dominance problem that can be
solved optimally with a Cartesian tree [6]. Thus, we immediately get an $(n, k)$ join-reachability
structure for two dipaths. We provide the details of this structure as we will need them in later
constructions. A Cartesian tree $T$ is a binary tree defined recursively as follows. The root of $T$ is
the point $a$ with minimum $x_2$-coordinate. The left subtree of the root is a Cartesian tree for the
points $b$ with $x_1(b) < x_1(a)$ and the right subtree of the root is a Cartesian tree for the points $b$
with $x_1(b) > x_1(a)$. Clearly this structure uses linear space, and moreover it can be constructed in
linear time [6]. The reporting algorithm uses the following property of Cartesian trees. Consider two
points $a$ and $b$, and let $c$ be the point with minimum $x_2$-coordinate such that
$x_1(a) \le x_1(c) \le x_1(b)$. Then, $c = \mathrm{nca}_T(a, b)$. Now let $\zeta$ be the point with the smallest
$x_1$-coordinate. In order to find all points $a$ such that $(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$ we first
locate $y = \mathrm{nca}_T(\zeta, b)$. The returned point $y$ has the smallest $x_2$-coordinate in the
$x_1$-range $[0, x_1(b)]$. If $x_2(y) > x_2(b)$ then the answer is null and we stop our search. Otherwise
we return $y$ and search recursively in the $x_1$-ranges $[0, x_1(y) - 1]$ and $[x_1(y) + 1, x_1(b)]$.
Using the fact that nearest common ancestor queries in a tree can be answered in constant time after
linear time preprocessing [10], it follows that the time to report $k$ vertices is $O(k)$.
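
The reporting recursion can be sketched in Python as below. Note the substitution: instead of a Cartesian tree with constant-time nearest common ancestor queries, this sketch answers the range-minimum queries with a sparse table, which is simpler to write but uses $O(n \log n)$ space rather than linear space. The input `vals[i]` is assumed to hold the $x_2$-coordinate of the point whose $x_1$-coordinate is `i`, and the function names are hypothetical.

```python
def build_sparse(vals):
    """table[j][i] = index of the minimum of vals[i : i + 2**j]."""
    n = len(vals)
    table = [list(range(n))]
    j = 1
    while (1 << j) <= n:
        prev, half = table[-1], 1 << (j - 1)
        table.append([min(prev[i], prev[i + half], key=vals.__getitem__)
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table

def argmin(vals, table, lo, hi):
    """Index of the minimum of vals[lo..hi] (inclusive) in constant time."""
    j = (hi - lo + 1).bit_length() - 1
    a, b = table[j][lo], table[j][hi - (1 << j) + 1]
    return a if vals[a] <= vals[b] else b

def dominated(vals, table, q1, q2):
    """All x1-coordinates i <= q1 whose point has x2-coordinate vals[i] <= q2."""
    out, stack = [], [(0, q1)]
    while stack:
        lo, hi = stack.pop()
        if lo > hi:
            continue
        y = argmin(vals, table, lo, hi)
        if vals[y] > q2:          # nothing in this x1-range qualifies
            continue
        out.append(y)
        stack.append((lo, y - 1))
        stack.append((y + 1, hi))
    return out
```

To answer a query for a vertex $b$ one would call `dominated(vals, table, x1_b, x2_b)`; every reported point triggers at most two further range-minimum queries, so the work is proportional to the output size.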

As in Section 3.1, we can achieve the same bounds when $G_1$ and $G_2$ are unoriented dipaths by
splitting them into maximal subpaths consisting of arcs with the same orientation.

4.2 Tree and Path 

Next we consider the case where $G_1$ is a rooted tree and $G_2$ is a dipath. As in Section 3.2
we note that a rooted tree can be described by two linear orders, and therefore we can get an
$(n, \log n + k)$ solution using a three-dimensional dominance reporting structure [11]. Here we
develop an alternative method that reduces the dimension of our problem and as a result achieves an
$(n, k)$ bound. Furthermore, this method can be extended to give more efficient structures for two
trees (compared to four-dimensional dominance reporting [11]). We will distinguish two cases
depending on whether $G_1$ is an out-tree or an in-tree. In any case, let $T$ be the rooted tree that
results from $G_1$ after removing arc directions. We associate each vertex $x \in T$ with a label
$h(x) = h_{G_2}(x)$, the height of $x$ in $G_2$. For an in-tree we wish to support the following query:
Given a vertex $b$ and a label $j$, find all vertices $a \in T(b)$ with $h(a) \ge j$. Equivalently, for an
out-tree the query algorithm needs to find all ancestors $a$ of $b$ in $T$ with $h(a) \ge j$. We present a
geometry-based method, which achieves $O(\log n + k)$ reporting time for an in-tree and $O(k)$ for an
out-tree. An alternative method, based on a heavy-path decomposition of $T$ [15], is given in
Appendix A.

We use the mapping of Section 3.2. Each vertex $a$ is assigned a depth-first search interval
$I(a) = [s(a), t(a)]$ in $T$ and is mapped to the $x_1$-axis-parallel segment $S(a) = I(a) \times h(a)$.
Now the choice of the structure we use depends on the arc directions in $G_1$. For an out-tree we have
that $a \leadsto_J b$ if and only if $S(a)$ is above $S(b)$ and the $x_1$-projection of $S(a)$ covers the
$x_1$-projection of $S(b)$. The fact that interval endpoints are distinct implies that $a \leadsto_J b$ if
and only if the vertical ray emanating from $(s(b), h(b))$ towards the $(+x_2)$-direction intersects
$S(a)$. Indeed, if $a \leadsto_J b$ then $h(b) \le h(a)$ and $b \in T(a)$, so $I(b) \subseteq I(a)$. Similarly,
if $S(a)$ is above $S(b)$ and $I(b) \subseteq I(a)$ then the ray intersects $S(a)$. Therefore, we have
reduced our problem to a planar segment intersection problem.
We can get an (n, k) structure by adapting either the hive graph of Chazelle [^ or the persistence- 
based planar point location structure of Sarnak and Tarjan [Tl]. Both these data structures require 
$O(n \log n)$ preprocessing time as they need to sort the endpoint coordinates. In our case sorting
is not necessary, since the $x_1$-coordinates are produced in sorted order by the depth-first search,
and the $x_2$-coordinates correspond to the height of the vertices in $G_2$. Hence our preprocessing
time is $O(n)$. Furthermore, the reporting time using either the hive graph or the persistence-based
structure is $O(\log n + k)$, where the $\log n$ term is due to a point location query. In our case this
term can be reduced to constant; point location is not necessary since the segment endpoints are
the only possible query locations. Hence our reporting time is $O(k)$.
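
The reduction for an out-tree can be made concrete with a small Python sketch. It only illustrates the mapping and the ray-stabbing test, not the hive graph or the persistent structure; the names `dfs_intervals` and `ray_hits`, the child-list dictionary and the label dictionary `h` are assumptions of this example.

```python
def dfs_intervals(children, root):
    """Assign each vertex an interval [s, t]; all 2n endpoints are distinct."""
    s, t, clock = {}, {}, 0
    stack = [(root, False)]
    while stack:
        v, finished = stack.pop()
        clock += 1
        if finished:
            t[v] = clock                     # leave v
        else:
            s[v] = clock                     # enter v
            stack.append((v, True))
            for c in reversed(children.get(v, [])):
                stack.append((c, False))
    return s, t

def ray_hits(s, t, h, a, b):
    """True if the upward vertical ray from (s[b], h[b]) meets the horizontal
    segment [s[a], t[a]] at height h[a]."""
    return s[a] <= s[b] <= t[a] and h[a] >= h[b]
```

Because the intervals form a laminar family, `s[a] <= s[b] <= t[a]` holds exactly when `b` lies in the subtree of `a`, so `ray_hits` agrees with join-reachability for an out-tree and a dipath with heights `h`.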

We turn to the case where $G_1$ is an in-tree. Here we have that $a \leadsto_J b$ if and only if $S(a)$
is below $S(b)$ and the $x_1$-projection of $S(b)$ covers the $x_1$-projection of $S(a)$. Since the
interval endpoints are distinct we have $a \leadsto_J b$ if and only if the endpoints of $S(a)$ are
contained inside the rectangle $[s(b), t(b)] \times [0, h(b)]$. This is a two-dimensional grounded range
search problem (one side of the query rectangle always lies on the $x_1$-axis). Since we have integer
coordinates in $[1, 2n] \times [0, n - 1]$ we can get an $(n, k)$ structure again with the use of a
Cartesian tree [6].

The $(n, k)$ bound is also achieved when $G_1$ is an unoriented tree, as stated in Theorem 4.1(a),
by applying the method of Section 3.4. Let $G_1^0, G_1^1, \ldots$ be the sequence of 2-layered rooted
trees produced from $G_1$. We construct a join-reachability structure for each pair
$P_i = \{G_1^i, G_2^i\}$, where $G_2^i$ is a dipath containing only the vertices in $V(G_1^i)$ in the order
they appear in $G_2$. A query for a vertex $b$ needs to search the structures for the pairs
$P_{\iota(b)-1}$ and $P_{\iota(b)}$. The structure for $P_i$ is
constructed as follows. We contract each fringe tree to its root and let the new core supervertex
correspond to the vertices of the contracted fringe tree. Let $\hat{G}_1^i$ be the tree produced from
this process. Next, we assign a depth-first search interval $I_i(z)$ to each vertex $z$ in $\hat{G}_1^i$,
and map $z$ to the $x_1$-axis-parallel segment $I_i(z) \times x_2(z)$, where $x_2(z) = h_{G_2^i}(z)$. Using
this mapping we can construct the data structures developed above depending on whether
$\hat{G}_1^i$ is an out-tree or an in-tree. One important detail is that if $\hat{G}_1^i$ is an in-tree then
the data structure for $P_i$ does not store the segments that correspond to fringe vertices; the
segment of such a fringe vertex $z$ is needed however in order to answer a join-reachability query for
$z$. Equivalently, if $\hat{G}_1^i$ is an out-tree and the query vertex $b$ is in a fringe in-tree of
$G_1^i$ then we do not search the structure for $P_i$.

4.3 Two Trees 

We extend the method of Section 4.2 in order to deal with two rooted trees $G_1$ and $G_2$. We
distinguish three cases depending on the type, in-tree or out-tree, of each tree. Then, by applying
the layer decomposition method of Section 3.4 we can extend our structures to handle unoriented
trees. This way we achieve the bounds stated in Theorem 4.1(b) and (c).

Let $T_1$ and $T_2$ be the corresponding undirected trees. We assign each vertex $a$ two depth-first
search intervals $I_1(a) = [s_1(a), t_1(a)]$ and $I_2(a) = [s_2(a), t_2(a)]$, where $I_j(a)$ corresponds to
$T_j$, for $j = 1, 2$. We use these two intervals to map each vertex $a$ to an axis-parallel rectangle
$R(a) = I_1(a) \times I_2(a)$. See Figure 8. Again we exploit the fact that for any two vertices $a$ and
$b$, the intervals $I_j(a)$ and $I_j(b)$ are either disjoint or one contains the other. If
$I_1(a) \cap I_1(b) = \emptyset$ or $I_2(a) \cap I_2(b) = \emptyset$, then $R(a)$ and $R(b)$ do not intersect.
Now suppose that both $I_1(a) \cap I_1(b) \neq \emptyset$ and $I_2(a) \cap I_2(b) \neq \emptyset$. Without
loss of generality, consider that $I_1(b) \subseteq I_1(a)$. If $I_2(b) \subseteq I_2(a)$ then $R(b)$ is
contained in $R(a)$. Otherwise, if $I_2(a) \subseteq I_2(b)$ then both horizontal edges of $R(a)$ intersect
both vertical edges of $R(b)$. Next, we distinguish three cases depending on the type of the two trees.

Figure 8: Example of the mapping of Section 4.3.

First suppose that both $G_1$ and $G_2$ are out-trees. Then $a \leadsto_J b$ implies $b \in T_1(a)$ and
$b \in T_2(a)$. So here we have $I_1(b) \subseteq I_1(a)$ and $I_2(b) \subseteq I_2(a)$, thus $R(b)$ is
contained in $R(a)$. In particular, the rectangle arrangement has the property that $a \leadsto_J b$ if
and only if $R(a)$ encloses a corner of $R(b)$. This property implies that we have a two-dimensional
point enclosure problem: In order to report all vertices $a$ such that $a \leadsto_J b$ we need to find all
rectangles $R(a)$ that enclose a corner of $R(b)$. To that end, we can use the point enclosure structure
of Chazelle [4] to get an $(n, \log n + k)$ join-reachability structure.

Next, suppose that $G_1$ is an out-tree and $G_2$ is an in-tree. In this case $a \leadsto_J b$ if and only
if $b \in T_1(a)$ and $a \in T_2(b)$, which implies $I_1(b) \subseteq I_1(a)$ and $I_2(a) \subseteq I_2(b)$.
Thus, $R(a)$ intersects $R(b)$. Furthermore, the properties of the depth-first search intervals imply
that $a \leadsto_J b$ if and only if the segment $s_1(b) \times I_2(b)$ intersects $I_1(a) \times s_2(a)$. This is
an orthogonal segment intersection problem, for which we can get an $(n, k)$ join-reachability
structure as in Section 4.2.

The last case is when $G_1$ and $G_2$ are in-trees. Now $a \leadsto_J b$ if and only if $a \in T_1(b)$ and
$a \in T_2(b)$. Then we have $I_1(a) \subseteq I_1(b)$ and $I_2(a) \subseteq I_2(b)$, which implies that
$a \leadsto_J b$ if and only if $R(b)$ encloses a corner of $R(a)$. Thus, our reporting query reduces to
orthogonal range searching. Here the results of Alstrup et al. [3] imply an
$(n \log^{\varepsilon} n, \log\log n + k)$ join-reachability structure (for any constant $\varepsilon > 0$).
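
The three geometric tests can be written down directly; the Python sketch below does so and is meant only to illustrate the mapping, not the point enclosure or range searching structures cited above. Trees are given as parent arrays with `None` for the root, and all function names are hypothetical.

```python
def intervals(parent):
    """Depth-first search intervals (s, t) of a rooted tree given as a parent array."""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = 0
    for v, p in enumerate(parent):
        if p is None:
            root = v
        else:
            children[p].append(v)
    s, t, clock = [0] * n, [0] * n, 0
    stack = [(root, False)]
    while stack:
        v, finished = stack.pop()
        clock += 1
        if finished:
            t[v] = clock
        else:
            s[v] = clock
            stack.append((v, True))
            stack.extend((c, False) for c in children[v])
    return s, t

def out_out(I1, I2, a, b):
    """Both out-trees: R(a) encloses the corner (s1(b), s2(b)) of R(b)."""
    (s1, t1), (s2, t2) = I1, I2
    return s1[a] <= s1[b] <= t1[a] and s2[a] <= s2[b] <= t2[a]

def out_in(I1, I2, a, b):
    """Out-tree and in-tree: the vertical segment s1(b) x I2(b) crosses the
    horizontal segment I1(a) x s2(a)."""
    (s1, t1), (s2, t2) = I1, I2
    return s1[a] <= s1[b] <= t1[a] and s2[b] <= s2[a] <= t2[b]

def in_in(I1, I2, a, b):
    """Both in-trees: R(b) encloses the corner (s1(a), s2(a)) of R(a)."""
    return out_out(I1, I2, b, a)
```

Each predicate takes the two interval labelings `I1 = intervals(parent1)` and `I2 = intervals(parent2)` and decides whether `a` reaches `b` in the join-reachability graph for the corresponding pair of tree types.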

4.4 Planar Digraphs 

With the help of Thorup's reachability oracle [18] we can develop efficient structures for
join-reachability in planar digraphs. Suppose first that $G_2$ is a dipath. We perform the layer
decomposition of $G_1$ and construct the corresponding graph sequence $G_1^0, G_1^1, \ldots$. Then we
form pairs of digraphs $P_i = \{G_1^i, G_2^i\}$, where $G_2^i$ is a dipath containing only the vertices in
$V(G_1^i)$ in the order they appear in $G_2$. Clearly $a \leadsto_J b$ if and only if
$a \leadsto_{J_{\iota(b)-1}} b$ or $a \leadsto_{J_{\iota(b)}} b$, where $J_i$ is the join-reachability graph of
$P_i$. For each pair $P_i$ we build a join-reachability structure. In order to answer a reporting query
for $b$ we query the structures for $P_{\iota(b)-1}$ and $P_{\iota(b)}$ independently and return the union
of the results.

It remains to describe the structure for a pair $P_i = \{G_1^i, G_2^i\}$. We perform the separator
decomposition of $G_1^i$, so that each vertex is associated with $O(\log n)$ separator dipaths. For each
vertex $v \in V(G_1^i)$ we record a set $S(v)$ containing the separator dipaths $Q$ that reach $v$
together with the number $\mathrm{from}_v[Q]$ (see Section 3.5). For each separator dipath $Q$ we record
the vertices $v$ that reach $Q$ together with the numbers $\mathrm{to}_v[Q]$. Next, for each separator
dipath $Q$ we build the data structure of Section 4.1 for the vertices that reach $Q$. Each such vertex
$a$ receives coordinates $(x_1(a), x_2(a))$ where $x_1(a) = \mathrm{to}_a[Q]$ and $x_2(a)$ is the rank of $a$
in $G_2$ among the vertices that reach $Q$. Now we can report the vertices that reach $b$ through $Q$ by
finding the vertices $a$ that satisfy $(x_1(a), x_2(a)) \le (\mathrm{from}_b[Q], x_2(b))$. To that end, we
use a Cartesian tree $T$ as in Section 4.1. Here we need to modify this structure in order to allow
points with identical $x_1$-coordinates. Since the $x_1$-coordinates are integers in the range
$[0, |Q| - 1]$ we find for each integer $i$ in that range the point $a_i$ with $x_1(a_i) = i$ and minimum
$x_2$-coordinate. Then we build a Cartesian tree for the points $a_i$, $0 \le i \le |Q| - 1$. Also, we
associate with $a_i$ a list of the remaining points with $x_1$-coordinate equal to $i$ in increasing
$x_2$-coordinate. Next, in order to initiate the search we also need to locate the vertex $c$ with
$x_1(c) = \mathrm{from}_b[Q]$. We can do that easily in $O(1)$ time by using an array of size $|Q|$ to map
the $x_1$-coordinates to the corresponding locations in $T$. Recall that the basic step of the
reporting algorithm is to locate the point with smallest $x_2$-coordinate in an $x_1$-range. If $y$ is
the corresponding point, then we check if $x_2(y) \le x_2(b)$. If this is the case, then we report $y$,
search the list associated with $y$, and report all points $z$ with $x_2(z) \le x_2(b)$. Clearly the
reporting time for $k$ points is still $O(k)$. Also the required space and preprocessing is
$O(|V(G_1^i)|)$. Therefore, the asymptotic preprocessing time and space are the same as in Thorup's
structure, i.e., $O(n \log n)$.

Finally we need to specify how to report all vertices $a$ such that $a \leadsto_J b$. We query the
structures for $P_{\iota(b)-1}$ and $P_{\iota(b)}$. To perform a query for $P_i$ we use the list of
separator dipaths that reach $b$, and for each such dipath $Q$ we use the corresponding Cartesian tree
to report the vertices $a$ that satisfy $(x_1(a), x_2(a)) \le (\mathrm{from}_b[Q], x_2(b))$. Let $k_Q$ be the
number of reported vertices. The total reporting time is bounded by
$\sum_{Q \in S(b)} k_Q = O(k \log n)$.
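
The modification for repeated $x_1$-coordinates can be sketched in Python as follows. As a simplification that is not part of the text, range minima over the column representatives are answered with a sparse table instead of a Cartesian tree with nearest common ancestor queries. The function names `build` and `report`, and the input format of `(x1, x2, name)` triples with `x1` in `range(width)`, are assumptions of this example.

```python
import math

def build(points, width):
    """Group points by x1, sort each column by x2, and index the column minima."""
    buckets = [[] for _ in range(width)]
    for x1, x2, name in points:
        buckets[x1].append((x2, name))
    for bucket in buckets:
        bucket.sort()
    # key[i] = smallest x2 in column i; empty columns never qualify
    key = [b[0][0] if b else math.inf for b in buckets]
    table = [list(range(width))]
    j = 1
    while (1 << j) <= width:
        prev, half = table[-1], 1 << (j - 1)
        table.append([min(prev[i], prev[i + half], key=key.__getitem__)
                      for i in range(width - (1 << j) + 1)])
        j += 1
    return buckets, key, table

def report(struct, q1, q2):
    """Names of all points with x1 <= q1 and x2 <= q2."""
    buckets, key, table = struct
    out, stack = [], [(0, q1)]
    while stack:
        lo, hi = stack.pop()
        if lo > hi:
            continue
        j = (hi - lo + 1).bit_length() - 1
        y = min(table[j][lo], table[j][hi - (1 << j) + 1], key=key.__getitem__)
        if key[y] > q2:
            continue
        for x2, name in buckets[y]:   # representative first, then the rest of the list
            if x2 > q2:
                break
            out.append(name)
        stack.append((lo, y - 1))
        stack.append((y + 1, hi))
    return out
```

For a separator dipath $Q$ one would call `report(struct, from_b, x2_b)` with `from_b` standing for $\mathrm{from}_b[Q]$; scanning a column stops at the first point that fails the test, so the extra work per column is constant.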

Using the results of Section 4.2 we can get join-reachability structures when $G_2$ is a rooted or
an unoriented tree. Let $I_2(a)$ be the depth-first search interval assigned to each vertex $a$ in $T_2$,
where $T_2$ is the undirected version of $G_2$. If $G_2$ is an out-tree then we report the vertices $a$
that satisfy $\mathrm{to}_a[Q] \le \mathrm{from}_b[Q]$ and $I_2(b) \subseteq I_2(a)$, which by Section 4.2 can be
done in $O(k_Q)$ time. Similarly, if $G_2$ is an in-tree then we report the vertices $a$ that satisfy
$\mathrm{to}_a[Q] \le \mathrm{from}_b[Q]$ and $I_2(a) \subseteq I_2(b)$, which again takes $O(k_Q)$ time with the
structure of Section 4.2. So, the total reporting time in both cases is bounded by $O(k \log n)$.
Theorem 4.1(d) follows. With similar ideas we can obtain an $(n \log^2 n, k \log^2 n)$ structure when
$G_2$ is also a planar digraph, as stated by Theorem 4.1(e).

4.5 General Digraphs 

Here we examine how to obtain join-reachability structures for general digraphs with the use of
dipath covers. We begin with the case where $G_2$ is a dipath.

Let $P_1^1, P_1^2, \ldots, P_1^{\kappa_1}$ be a dipath cover of $G_1$, and let $P_2^i$ be the dipath that
consists of the vertices in $P_1^i$ ordered by increasing rank in $G_2$. Also, let $V_{P_1^i}$ be the set of
vertices that have a predecessor in $P_1^i$. We build a join-reachability structure for each pair
$\{P_1^i, P_2^i\}$, which we use in order to report the vertices in $P_1^i$ that reach a query vertex in
both $G_1$ and $G_2$. To that end, each vertex $a$ in $P_1^i$ is assigned coordinates
$x_1(a) = r_{P_1^i}(a)$ and $x_2(a) = r_{P_2^i}(a)$, and we build a join-reachability structure for these
vertices as in Section 4.1. With this structure we can answer a reporting query for vertex $b$ by
finding the vertices $a$ that satisfy $(x_1(a), x_2(a)) \le (\mathrm{from}_b[P_1^i], \mathrm{from}_b[P_2^i])$, for
each $i \in \{1, \ldots, \kappa_1\}$. The reporting time is $O(k + \kappa_1)$ using $O(\kappa_1 n)$ space. The
reporting time can be reduced to $O(k)$ if we store for each vertex $v$ a list $I(v)$ of the indices
$i \in \{1, \ldots, \kappa_1\}$ such that the reporting query for $v$ in the join-reachability structure for
the pair $\{P_1^i, P_2^i\}$ is non-empty. Then we only need to query the structures for $i \in I(v)$. The
asymptotic space bound remains $O(\kappa_1 n)$.

We can extend the above method in order to handle two general graphs. The resulting bounds,
however, are interesting only when the product $\kappa_1 \kappa_2$ is small compared to $n$, where
$\kappa_2$ is the number of disjoint dipaths in a dipath cover of $G_2$. Specifically, we can get either
$O((\kappa_1 + \kappa_2) n)$ space and $O(\kappa_1 \kappa_2 + k)$ reporting time, or $O(\kappa_1 \kappa_2 n)$
space and $O(k)$ reporting time. (In the latter structure we improve the reporting time by storing for
each vertex $v$ the pairs of dipaths in the cover of $G_1$ and $G_2$ that contain a common predecessor
of $v$.) This implies Theorem 4.1(h). By combining the dipath cover method with the techniques of
Section 4.2 we obtain the bound of Theorem 4.1(f). Similarly, the techniques of Section 4.4 imply
Theorem 4.1(g).

5 Conclusions and Open Problems 

We explored the computational and combinatorial complexity of the join-reachability graph, and
the design of efficient join-reachability data structures for a variety of graph classes. We believe that
several open problems deserve further investigation. For instance, from the aspect of combinatorial
complexity, it would be interesting to prove or disprove that an $O(m \cdot \mathrm{polylog}(n))$ bound on
the size of the join-reachability graph $\mathcal{J}(\{G_1, G_2\})$ is attainable when $G_1$ is a general
digraph with $n$ vertices and $m$ arcs and $G_2$ is a dipath. Another direction is to consider the
problem of approximating the smallest join-reachability graph for specific graph classes. From the
aspect of data structures, we can consider the following type of join-reachability query: Given
vertices $b$ and $c$, report (or count) all vertices $a$ such that $a \leadsto_{G_1} b$ and
$a \leadsto_{G_2} c$.
Appendices



In the Appendices we provide additional join-reachability data structures. In Appendix A we
apply the heavy-path decomposition of trees [15] in order to get alternative join-reachability data
structures for trees and paths. In Appendix B we consider the case of planar st-graphs [17], and in
Appendix C we consider lattices.

A Join-Reachability for Trees based on Heavy-Path Decomposition

Let $T$ be the rooted tree that results from $G_1$ after removing arc directions. We develop a method
based on partitioning $T$ into heavy paths [15]. This is done as follows. A child $a'$ of $a$ is heavy if
$|T(a')| > |T(a)|/2$, and light otherwise. The light level of a vertex $a$ is the number of light vertices
on the path from $a$ to the root of $T$. Each vertex has at most one heavy child and its light level is
$O(\log n)$. The heavy paths are formed by the edges connecting a heavy child to its parent and the
topmost vertex of a heavy path is light.
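
A minimal Python sketch of this classification follows. It assumes the tree is given as a parent array with `None` marking the root, and it counts only non-root light vertices in the light level, which is a convention chosen for this example; the name `heavy_light` is hypothetical.

```python
def heavy_light(parent):
    """Return (size, heavy, light_level) for a rooted tree given as a parent array."""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = None
    for v, p in enumerate(parent):
        if p is None:
            root = v
        else:
            children[p].append(v)
    order, i = [root], 0
    while i < len(order):                 # breadth-first order, parents before children
        order.extend(children[order[i]])
        i += 1
    size = [1] * n
    for v in reversed(order):             # accumulate subtree sizes bottom-up
        if parent[v] is not None:
            size[parent[v]] += size[v]
    # a child is heavy if its subtree holds more than half of its parent's subtree
    heavy = [parent[v] is not None and 2 * size[v] > size[parent[v]] for v in range(n)]
    light_level = [0] * n
    for v in order[1:]:
        light_level[v] = light_level[parent[v]] + (0 if heavy[v] else 1)
    return size, heavy, light_level
```

Moving from a vertex to a light child at least halves the subtree size, so the light level computed here never exceeds $\log_2 n$.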

First we consider the case where G2 is a dipath, and then the case where G2 is a (in- or out-)tree. 

A.1 Tree and Path

Based on the heavy-path decomposition of $T$, we describe a structure with $O(k \log n)$ reporting
time for an in-tree and $O(\log n + k)$ reporting time for an out-tree. These bounds are inferior to
the ones given in Section 4.2, but are achieved with simpler structures.

Consider the in-tree query first. Here our method is inspired by a routing scheme for trees by
Thorup and Zwick [19]. Let $h(T(a))$ be the maximum label in $T(a)$. Obviously, we need to search
$T(a)$ only if $h(T(a)) \ge j$. Let $h'(T(a))$ be the maximum label in $T(a) \setminus T(a')$, where $a'$ is
the heavy child of $a$ (if it exists). The search proceeds top-down starting from $b$. Let $a$ be the
current vertex such that $h(T(a)) \ge j$. If $h(a) \ge j$, we report $a$. Then we identify the light
children $c$ of $a$ such that $h(T(c)) \ge j$. Moreover, if $a$ is the topmost vertex of its heavy path
$P$, then we identify the vertices $d \in P$ such that $h'(T(d)) \ge j$. Then, we repeat this process at
each vertex that we have identified. In order to locate these vertices quickly, for each vertex $a$ we
order its light children $c$ by $h(T(c))$, and for each heavy path $P$ we order the vertices $d \in P$ by
$h'(T(d))$. Note that when we visit a light child $c$, the light level increases and there is at least
one $x \in T(c)$ with $h(x) \ge j$. The $O(k \log n)$ bound follows.

For the out-tree query we use the same heavy-path decomposition and construct a Cartesian
tree for each heavy path $P$. (See Section 4.1.) The Cartesian tree for $P$ stores the vertices
$a \in P$ according to coordinates $(x_1(a), x_2(a)) = (h_P(a), h_{G_2}(a))$. Furthermore, each vertex has
a pointer to the topmost vertex of its heavy path, and each topmost vertex of a heavy path has a
pointer to its parent in $T$. Let $b$ be the query vertex and let $Q$ be the tree path from the root of
$T$ to $b$. The goal is to identify the vertices $a \in Q$ with $h(a) \ge j$. We locate the heavy paths
that intersect $Q$ and query them individually. For each such heavy path $P$ we identify the
bottommost vertex $p \in P \cap Q$. The query for $P$ has to report the vertices $a \in P$ such that
$(x_1(a), x_2(a)) \ge (x_1(p), j)$. As mentioned in Section 4.1, Cartesian trees can report these vertices
in constant time per vertex. Since $Q$ intersects $O(\log n)$ heavy paths the total query time is
$O(\log n + k)$.

A.2 Two Trees

With the heavy-path decomposition method we can get an efficient join-reachability structure when
one of the two trees is an out-tree. Without loss of generality we assume that $G_1$ is an out-tree.
We perform the heavy-path decomposition of $T$ as earlier and associate with each heavy path $P$
a secondary data structure $D_P$; the choice of the secondary structure depends on the type of $G_2$.
Also for each vertex $a \in P$ we store $h_P(a)$, the height of $a$ in $P$. Given a query vertex $b$ we
want to report the ancestors $a$ of $b$ in $T$ that reach $b$ in $G_2$. Let $Q$ be the path in $T$ from the
root to $b$. Our algorithm queries the structure $D_P$ for each heavy path $P$ that intersects $Q$. For
each such heavy path $P$ we identify the bottommost vertex $p \in P \cap Q$. If $G_2$ is an out-tree
then we need to report the vertices $a \in P$ that satisfy $I_2(b) \subseteq I_2(a)$ and
$h_P(a) \ge h_P(p)$. In this case, a suitable choice for $D_P$ is a join-reachability structure for an
out-tree and a path. Either of the two solutions we developed earlier (Sections 4.2 and A.1) achieves
$O(\log |P| + k_P)$ reporting time (because here we need to locate $b$ in $D_P$), where $k_P$ is the
number of reported vertices on $P$. This results in an overall $(n, \log^2 n + k)$ structure. For the
case where $G_2$ is an in-tree we need to report the vertices $a \in P$ that satisfy
$I_2(a) \subseteq I_2(b)$ and $h_P(a) \ge h_P(p)$. Here we choose $D_P$ to be a join-reachability
structure for an in-tree and a path. Using the geometry-based structure of Section 4.2 results in an
overall $(n, \log^2 n + k)$ structure.

B Planar st-Graphs

Here we consider the case where $G_1$ is a planar st-graph [17] and $G_2$ is a dipath. A planar
st-graph is a planar acyclic digraph with a single source $s$ and a single sink $t$, such that $s$ and
$t$ are on the boundary of the same face. For these graphs Kameda [12] gave an $O(n)$-space structure
that answers reachability queries in constant time. His algorithm performs two modified depth-first
searches and assigns to each vertex $a$ two integer labels $\ell_1(a)$ and $\ell_2(a)$, both in the range
$[1, n]$. Kameda then shows that these labels satisfy the property that $a \leadsto_{G_1} b$ if and only
if $\ell_1(a) \le \ell_1(b)$ and $\ell_2(a) \le \ell_2(b)$. Our data structure for the join-reachability
problem also assigns each vertex $a$ a third label $\ell_3(a)$ equal to the rank of $a$ in $G_2$. Now each
vertex corresponds to a point in a three-dimensional rank space and $a \leadsto_J b$ if and only if
$(\ell_1(a), \ell_2(a), \ell_3(a)) \le (\ell_1(b), \ell_2(b), \ell_3(b))$. Using a three-dimensional dominance
structure we can get an $(n, \log n + k)$ join-reachability structure.
With minor adjustments we can get an efficient data structure for the more general class of spherical
st-graphs [17], which are planar st-graphs without the requirement that $s$ and $t$ appear on the
boundary of the same face. Tamassia and Tollis [17] showed how to reduce the reachability problem
on these graphs to a reachability problem on planar st-graphs.
C Lattices



Let $(\le, V)$ be a partial order. An element $z \in V$ is an upper bound of $x, y \in V$ if $x \le z$ and
$y \le z$. If $z$ is an upper bound of $x, y$ and moreover $z \le w$ for all upper bounds $w$ of $x, y$
then $z$ is a least upper bound of $x, y$. Similarly, if $z \le x$ and $z \le y$ then $z$ is a lower bound
of $x, y$, and if $w \le z$ for all lower bounds $w$ of $x, y$ then $z$ is a greatest lower bound of
$x, y$. A partial order $(\le, V)$ is a lattice if any two $x, y \in V$ have both a least upper bound and
a greatest lower bound. A partial lattice $(\le, V)$ is a partial order that can be extended to a lattice
by adding elements $s$ and $t$ such that $s \le x$ and $x \le t$ for any $x \in V$. Any acyclic digraph
$G = (V, A)$ has an associated partial order $P_G = (\le, V)$ such that for $u, v \in V$, $u \le v$ if and
only if $u \leadsto_G v$. We say that $G$ satisfies the lattice property if and only if its associated
partial order is a lattice. For this class of digraphs, Talamo and Vocca presented an
$O(n \sqrt{n})$-space structure that answers reachability queries in constant time [16]. Their structure
is also capable of reporting the predecessors of a query vertex in $O(k)$ time. In this section we show
how their structure can be extended in order to support efficient join-reachability.

Roughly speaking, the Talamo-Vocca structure represents $G$ as a collection of disjoint clusters with
$O(\sqrt{n})$ vertices each. Moreover, we can assume that there are $\Theta(\sqrt{n})$ clusters; refer to
[16] for details. Each cluster $C$ has a root vertex $c$ and consists of either a subset of the
predecessors of $c$, in which case it is an in-cluster, or of a subset of the successors of $c$, in which
case it is an out-cluster. A vertex $x \in C$ is an internal vertex of $C$; a vertex $x \notin C$ that
either reaches or is reachable from a vertex in $C$ is an external vertex of $C$. External vertices
have the following key property: If $x$ is an external vertex that reaches (resp. is reachable from)
a subset $S \subseteq C$ then $S$ contains the greatest lower bound (resp. least upper bound) of $S$,
which is the representative of $x$ in $C$. Now each vertex $x$ is associated with a subgraph $G(x)$
consisting of two trees rooted at $x$: an internal spanning tree $I(x)$ and an external spanning tree
$E(x)$. If the cluster $C$ containing $x$ is an in-cluster then the internal tree is an in-tree that
contains the predecessors of $x$ in $C$ and the external tree is an out-tree that contains the external
vertices of $C$ with $x$ as their representative. Similarly, if the cluster $C$ containing $x$ is an
out-cluster then the internal tree is an out-tree that contains the successors of $x$ in $C$ and the
external tree is an in-tree that contains the external vertices of $C$ with $x$ as their representative.
In order to be able to report all the predecessors of a query vertex $b$ this data structure can
explicitly store the predecessors of each vertex $x$ that are located in the same cluster with $x$.
Since each cluster has $O(\sqrt{n})$ vertices the data structure still occupies $O(n \sqrt{n})$ space. The
predecessors of $b$ outside its cluster are the predecessors of the vertices that are representatives
of $b$ in other clusters for which $b$ is an external vertex.

We can easily enhance the above structure so that it supports efficient join-reachability. We
demonstrate this first for the case where $G_2$ is a dipath. For each vertex x we construct a list
$L_1(x)$ of the internal predecessors of x sorted in increasing rank in $G_2$. Also we keep track of the
minimum rank in $G_2$ of the vertices in $L_1(x)$. Then we construct another list $L_2(x)$ that contains
the representatives of x in the clusters where x is an external vertex. Furthermore, $y \in L_2(x)$
only if the minimum rank in $L_1(y)$ is less than the rank of x. Now in order to report the vertices
reaching b in the join-reachability graph, we report the vertices in $L_1(a)$ with rank less than that
of b, for all $a \in L_2(b) \cup \{b\}$. Notice that we only visit clusters that contain at least one
predecessor of b. Therefore the reporting time is $O(k)$.
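
The reporting step for the dipath case can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the authors' implementation: the names build_lists, report, internal_preds, representatives and rank2 are hypothetical, the cluster decomposition is assumed to be given as input, and each list of internal predecessors is assumed to include the vertex itself, so that a representative is reported together with its internal predecessors.

```python
def build_lists(internal_preds, representatives, rank2):
    """Build the two lists used for reporting (all input names are hypothetical).

    internal_preds[x]  : predecessors of x inside its own cluster (assumed to include x)
    representatives[x] : representatives of x in clusters where x is external
    rank2[x]           : position of x on the dipath G2
    """
    # First list: internal predecessors of x, sorted by increasing rank in G2.
    L1 = {x: sorted(ps, key=lambda v: rank2[v]) for x, ps in internal_preds.items()}
    # Second list: keep a representative y only if its first list holds a vertex
    # ranked below x, i.e. only if visiting y is guaranteed to report something.
    L2 = {}
    for x, reps in representatives.items():
        L2[x] = [y for y in reps if L1.get(y) and rank2[L1[y][0]] < rank2[x]]
    return L1, L2


def report(b, L1, L2, rank2):
    """Report the vertices that reach b in the lattice and precede b on the dipath."""
    out = []
    for a in [b] + L2.get(b, []):
        for p in L1.get(a, []):
            if rank2[p] >= rank2[b]:
                break  # the list is sorted by rank, so no later entry qualifies
            out.append(p)
    return out


# Toy input: clusters {a, b}, {c, d}, {e}; vertex e is external to the first two.
internal_preds = {"a": ["a"], "b": ["a", "b"], "c": ["c"], "d": ["c", "d"], "e": ["e"]}
representatives = {"e": ["b", "d"]}
rank2 = {"a": 0, "e": 2, "b": 3, "c": 4, "d": 5}
L1, L2 = build_lists(internal_preds, representatives, rank2)
answer = report("e", L1, L2, rank2)
```

Because every list visited through the second list contributes at least one reported vertex, and each scan stops at the first vertex that fails the rank test, the work of report is proportional to the output size plus a constant.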

Now we show how the same bounds are achieved when $G_2$ is a rooted tree.
Let $I_2(a) = [s_2(a), t_2(a)]$ be the depth-first search interval assigned to each vertex a in $T_2$,
where $T_2$ is the undirected version of $G_2$. For each vertex x we construct a structure $D(x)$ that
contains the vertices in $L_1(x)$. In order to report the vertices that reach b in the
join-reachability graph, we query the structure $D(a)$ for all $a \in L_2(b) \cup \{b\}$. This structure
reports the vertices $z \in L_1(a)$ that satisfy $I_2(b) \subseteq I_2(z)$ if $G_2$ is an out-tree, or
$I_2(z) \subseteq I_2(b)$ if $G_2$ is an in-tree. Note that during the construction of the
join-reachability data structure we can ensure that $a \in L_2(b) \cup \{b\}$ only if the
answer of $D(a)$ to query b is nonempty. Finally, we need to specify how $D(a)$ operates. If $G_2$ is
an out-tree then $D(a)$ stores the intervals $I_2(z)$ for all $z \in L_1(a)$; a query asks for those
$z \in L_1(a)$ such that $I_2(z)$ contains the point $s_2(b)$. Otherwise, when $G_2$ is an in-tree, $D(a)$
stores the points $s_2(z)$ for all $z \in L_1(a)$; now a query asks for those $z \in L_1(a)$ such that
$s_2(z)$ is contained in $I_2(b)$.
Such queries can be answered optimally by Chazelle's interval overlap structure [3], which gives us
the desired result.
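
The two query types can be illustrated with a short Python sketch. It is only a stand-in under stated assumptions: the function names dfs_intervals and query_D and the variables children, s and t are hypothetical, the interval of a vertex is taken to be its preorder number together with the largest preorder number in its subtree, and a plain linear scan replaces Chazelle's structure [3], so the sketch shows which vertices are reported but does not attain the output-sensitive query time.

```python
def dfs_intervals(children, root):
    """Assign each vertex v an interval [s[v], t[v]] by an iterative DFS.

    s[v] is the preorder number of v and t[v] is the largest preorder number
    in the subtree of v, so u is an ancestor of v (or v itself) exactly when
    s[u] <= s[v] <= t[u].
    """
    s, t = {root: 0}, {}
    clock = 0
    stack = [(root, iter(children.get(root, [])))]
    while stack:
        v, it = stack[-1]
        child = next(it, None)
        if child is None:
            t[v] = clock  # all descendants of v have been numbered
            stack.pop()
        else:
            clock += 1
            s[child] = clock
            stack.append((child, iter(children.get(child, []))))
    return s, t


def query_D(candidates, b, s, t, out_tree):
    """Naive linear-scan stand-in for the structure D(a).

    Out-tree: report z whose interval contains the point s[b] (z is an ancestor of b).
    In-tree : report z whose point s[z] lies in the interval of b (z is a descendant of b).
    A vertex passes the test against itself, so b is returned if it is a candidate.
    """
    if out_tree:
        return [z for z in candidates if s[z] <= s[b] <= t[z]]
    return [z for z in candidates if s[b] <= s[z] <= t[b]]


# Toy tree: r has children u and v; u has child w.
children = {"r": ["u", "v"], "u": ["w"]}
s, t = dfs_intervals(children, "r")
ancestors_of_w = query_D(["r", "u", "v"], "w", s, t, out_tree=True)
descendants_of_u = query_D(["w", "v", "r"], "u", s, t, out_tree=False)
```

In the out-tree case the edges point away from the root, so the vertices that reach b are its ancestors; in the in-tree case they are its descendants, which is why the same intervals serve both query types.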

Theorem C.1. Given a lattice $G_1$ and an unoriented tree $G_2$ with n vertices we can construct an
$(n\sqrt{n}, k)$ join-reachability data structure.